Optional segfault

Hello, I found a strange behavior that I cannot explain:

if (previous != null)
    const a: *State = previous.?;
    std.debug.print("{}\n", .{a.en_passant.file()}); // WORKS
    std.debug.print("{}\n", .{previous.?.en_passant.file()}); // SEGFAULT
    std.debug.print("{}\n", .{(previous.?).en_passant.file()}); // SEGFAULT
}

with the function

pub inline fn file(self: Square) File {
    const trunc: u3 = @truncate(@intFromEnum(self));
    return @enumFromInt(trunc);
}

outputs :

types.File.fa
Segmentation fault at address 0xffffffffffffffff

Is there a known bug related to this?

I tried to reproduce in a smaller example without success, please let me know if you want more details. Here is the direct link for the related code.

I’ve compiled you program and reproduced the segfault. I’ve looked at the disassembly, ran it in a debugger, checked the registers, but couldn’t figure out what it is.
So, my best guess, is that it’s eiter a miscompilation or you got a really well-hidden undefined behavior somewhere.

2 Likes

Maybe valgrind could help.

I think some code causes either state or previous to point to a temporary/local State variable that goes out of scope and thus pointers become garbage and start pointing to invalid memory addresses.
If valgrind and debugging doesn’t help, I would start printing and comparing pointer addresses, maybe adding more asserts.

Another thing that I would do (but might be unrelated) is to reduce the amount of default-initialized fields and instead use a named empty value with decl literals. And I would avoid using undefined for default values, instead set the value explicitly to undefined in places where it is valid. (To make sure that there is no place that forgets to set a field and lets it just default set to undefined)

2 Likes

Do you have a Stacktrace/Backtrace or could you add std.debug.dumpCurrentStackTrace before the crash?

Okay, never used this function, I saw it can be used with the null argument.

Not sure if this helps:

C:\Program Files\zig-windows-x86_64-0.14.0\lib\std\debug.zig:315:31: 0x7ff6f417c06b in dumpCurrentStackTrace (radiance.exe.obj)
        writeCurrentStackTrace(stderr, debug_info, io.tty.detectConfig(io.getStdErr()), start_addr) catch |err| {
                              ^
D:\Documents\dev\zig\radiancerecharged\src\position.zig:193:44: 0x7ff6f4206c24 in movePiece (radiance.exe.obj)
            std.debug.dumpCurrentStackTrace(null);
                                           ^
D:\Documents\dev\zig\radiancerecharged\src\search.zig:255:26: 0x7ff6f4222653 in abSearch__anon_27804 (radiance.exe.obj)
        try pos.movePiece(move, &s);
                         ^
D:\Documents\dev\zig\radiancerecharged\src\search.zig:282:38: 0x7ff6f422370f in abSearch__anon_27788 (radiance.exe.obj)
                score = -try abSearch(allocator, NodeType.pv, ss + 1, pos, eval, -beta, -alpha, current_depth - 1);
                                     ^
D:\Documents\dev\zig\radiancerecharged\src\search.zig:165:52: 0x7ff6f42251a5 in iterativeDeepening__anon_27689 (radiance.exe.obj)
            const score: types.Value = try abSearch(allocator, NodeType.root, ss, pos, eval, alpha, beta, current_depth);
                                                   ^
D:\Documents\dev\zig\radiancerecharged\src\interface.zig:399:53: 0x7ff6f4227f29 in cmd_go__anon_27359 (radiance.exe.obj)
                move = try search.iterativeDeepening(allocator, stdout, pos, limits, evaluate.evaluateTable);
                                                    ^
D:\Documents\dev\zig\radiancerecharged\src\interface.zig:147:23: 0x7ff6f422cf79 in loop__anon_26097 (radiance.exe.obj)
            if (cmd_go(allocator, stdout, &pos, &tokens, &states, &options)) {} else |err| {
                      ^
D:\Documents\dev\zig\radiancerecharged\src\main.zig:24:23: 0x7ff6f42303a4 in main (radiance.exe.obj)
    try interface.loop(std.heap.c_allocator, &stdin, &stdout);
                      ^
C:\Program Files\zig-windows-x86_64-0.14.0\lib\std\start.zig:631:28: 0x7ff6f42308a9 in main (radiance.exe.obj)
    return callMainWithArgs(@as(usize, @intCast(c_argc)), @as([*][*:0]u8, @ptrCast(c_argv)), envp);
                           ^
C:\Program Files\zig-windows-x86_64-0.14.0\lib\libc\mingw\crt\crtexe.c:266:0: 0x7ff6f425ebab in __tmainCRTStartup (crt2.obj)
    mainret = _tmain (argc, argv, envp);

C:\Program Files\zig-windows-x86_64-0.14.0\lib\libc\mingw\crt\crtexe.c:186:0: 0x7ff6f425ec0b in mainCRTStartup (crt2.obj)
  ret = __tmainCRTStartup ();

I tried a bunch of things but I am also not sure whats going on, I don’t have the custom lldb-fork set-up, I don’t know whether that would show more.

The debugger shows a zero dereference but if you use any kind of temporary variable before the .file() then the code just ends up working, so that also makes me think it may be some kind of miscompilation, when code that should be equivalent, works.

I guess we need someone who is more used to looking at the assembly and can spot if something is wrong there.

3 Likes

Not necessarily related, but is there a reason you’re using inline so often? inline forces the function to be inlined, but Zig can also decide to inline the function on a regular function call.

I don’t think I use it that much, it’s mostly for one line fonctions that I use it.
There is a mother function for a specific task that calls “tool” functions which may be inline. To me this makes sense even thought I’m not sure it’s the best coding pattern.
Often I try to avoid inlining too much as it is the compiler’s task

It looks like it is loading the wrong pointer address:

&a.en_passant = types.Square@7fff781d3360
&self.state.previous.en_passant = types.Square@40000000010101e

No idea why. It does seem like a miscompilation.

Changes to the logging statements I made:

diff --git a/src/position.zig b/src/position.zig
index 08f31ea..98c4d5d 100644
--- a/src/position.zig
+++ b/src/position.zig
@@ -188,12 +188,9 @@ pub const Position = struct {
         // Remove last en_passant
         if (self.state.previous != null and self.state.previous.?.en_passant != Square.none) {
             const a: *State = self.state.previous.?;
-            std.debug.print("{}\n", .{a.en_passant.file()});
-            std.debug.print("BEGIN of trace\n", .{});
-            std.debug.dumpCurrentStackTrace(null);
-            std.debug.print("END of trace\n", .{});
-            std.debug.print("{}\n", .{((self.state.previous).?).en_passant.file()});
-            std.debug.print("{}\n", .{self.state.previous.?.en_passant.file()});
+            std.debug.print("&a.en_passant = {*}\n", .{&a.en_passant});
+            std.debug.print("&self.state.previous.en_passant = {*}\n", .{&self.state.previous.?.en_passant});
+            std.debug.print("self.state.previous.en_passant.file() = {}\n", .{self.state.previous.?.en_passant.file()});
             // self.state.material_key ^= tables.hash_en_passant[self.state.previous.?.en_passant.file().index()];
         }

Figured out how to stop it crashing:

diff --git a/src/position.zig b/src/position.zig
index 08f31ea..c0812ab 100644
--- a/src/position.zig
+++ b/src/position.zig
@@ -48,7 +48,7 @@ const CastleInfo = enum(u4) {
     }
 };
 
-pub const State = packed struct {
+pub const State = struct {
     turn: Color = Color.white,
     castle_info: CastleInfo = CastleInfo.none,
     repetition: i7 = 0, // Zero if no repetition, x positive if happened once x half moves ago, negative indicates repetition

Still not sure why it was crashing, and it still seems like a miscompilation. However, I’m no longer surprised that it was crashing. Why are you putting a pointer into a packed struct? In general I would only put value or index types into a packed struct, otherwise I think packed structs are the wrong tool for the job.

2 Likes

Found a reduction:

const Square = enum(u8) {
    a,
    b,
    c,

    pub fn file(self: Square) u3 {
        const trunc: u3 = @truncate(@intFromEnum(self));
        return trunc;
    }
};

const State = packed struct {
    square: Square,
    size: u1 = 0,
    previous: ?*State,
};

pub fn main() !void {
    var root = State{
        .square = .a,
        .previous = null,
    };
    var prev = State{
        .square = .b,
        .previous = &root,
    };
    var self = State{
        .square = .c,
        .previous = &prev,
    };

    const a: *State = self.previous.?;
    std.debug.print("{}\n", .{a.square.file()});
    std.debug.print("{}\n", .{self.previous.?.square.file()});
}

const std = @import("std");
~/tmp/packed_struct_ptr> zig run packed_struct_ptr.zig
1
General protection exception (no address available)
/home/geemili/tmp/packed_struct_ptr/packed_struct_ptr.zig:34:58: 0x10dde37 in main (packed_struct_ptr)
    std.debug.print("{}\n", .{self.previous.?.square.file()});
                                                         ^
/home/geemili/code/zig/build-master/stage3/lib/zig/std/start.zig:656:37: 0x10ddd52 in posixCallMainAndExit (packed_struct_ptr)
            const result = root.main() catch |err| {
                                    ^
/home/geemili/code/zig/build-master/stage3/lib/zig/std/start.zig:271:5: 0x10dd92d in _start (packed_struct_ptr)
    asm volatile (switch (native_arch) {
    ^
???:?:?: 0x0 in ??? (???)

Removing the size field stops the segfaults:

--- packed_struct_ptr.zig	2025-03-21 19:36:00.149081800 -0600
+++ packed_struct_ptr_no_segfault.zig	2025-03-21 19:39:09.410131898 -0600
@@ -11,7 +11,6 @@
 
 const State = packed struct {
     square: Square,
-    size: u1 = 0,
     previous: ?*State,
 };
 
~/tmp/packed_struct_ptr> zig run packed_struct_ptr_no_segfault.zig
1
1

Seems like some kind of alignment issue, where the u1 makes it an invalid size. It reminds me of this issue, but I don’t think the problems are actually related.

4 Likes

Nice work! Truly a detective here. It seems worth to open an issue.

Well done!
And thank you, I’m using a lot of State objects so wanted to lower their memory footprint. I’ll be using packed struct for very specific ojects now, like for the Move one.

I let you create the ticket with the reduction if you think it’s worth it.

Yeah from the doc I knew I should avoid undefined, I’ll work on that too.

I went ahead and added a comment to an existing issue that seemed relevant: x86_64: packed struct with pointer field assigned from global gets mangled as function argument · Issue #23131 · ziglang/zig · GitHub

Edit: Also wanted to mention that limiting packed structs to more specific objects seems like a good move.

4 Likes