A bug that goes away when I std.debug.print()

I’ve been playing with interfaces as described in David Vanderson’s post, and I have decided to replace fn interface(self: *PickRandom) Interface with an instance of Interface stored in PickRandom, initialized in init().
Strangely, my code crashes in seemingly random ways, but if I add a line to pub fn pick(self: *const Interface) i32 that accesses self or self.impl (it probably just has to access self), the bug goes away.

Could somebody please help me understand why this is happening?

Here’s the code (yes, I’m probably doing it wrong in PickRandom’s fn init(), as what it returns is a copy of me, which means the pointer is wrong, but the question is about the magic of std.debug.print()):

const std = @import("std");

pub fn main() void
{
    var pick_random = PickRandom.init();
    ifaceUserFn(pick_random.iface);
}

pub fn ifaceUserFn(iface: Interface) void
{
    for (1..4) |i|
    {
        const p = iface.pick();
        std.debug.print("ifaceUserFn() call #{d} returned {d}\n", .{ i, p });
    }
}

const Interface = struct
{
    impl: *anyopaque, // pointer to the implementing struct (we don't know the type)
    pickFn: *const fn (*anyopaque) i32, // can call directly: iface.pickFn(iface.impl)

    pub fn pick(self: *const Interface) i32 // allows calling: iface.pick()
    {
        // Uncommenting the line below doesn't make the bug go away
        //std.debug.print("pick()\n", .{});
        // But uncommenting any other line does
        //std.debug.print("pick() - iface is {*}, iface.impl points to {*}\n",
        //                .{ self, self.impl });
        //std.debug.print("{*}\n", .{ self });
        const a = self.pickFn(self.impl);
        //std.debug.print("pick() - iface is {*}, iface.impl points to {*}\n",
        //                .{ self, self.impl });
        //std.debug.print("{*}\n", .{ self });
        return a;
    }
};

const PickRandom = struct
{
    rnd_gen: std.rand.DefaultPrng, // specific to PickRandom
    iface: Interface,

    fn init() PickRandom
    {
        var me = PickRandom
        {
            .rnd_gen = std.rand.DefaultPrng.init(0),
            .iface = undefined,
        };
        me.iface =
        .{
            .impl = @as(*anyopaque, @ptrCast(&me)),
            .pickFn = myPick,
        };
        std.debug.print("init() - new PickRandom = {*}, its iface.impl points to {*}\n",
                        .{ &me, me.iface.impl });
        return me;
    }

    fn myPick(self_void: *anyopaque) i32
    {
        // cast typeless impl pointer to ourselves
        var self = @as(*PickRandom, @ptrCast(@alignCast(self_void)));
        return self.rnd_gen.random().intRangeAtMost(i32, 10, 20);
    }
};

Errors I get if I don’t access self or self.impl in pub fn pick():

Segmentation fault at address 0x0
/home/archie/projects/zig-playground/iface_02b_stripped.zig:69:9: 0x2220f9 in myPick (iface_02b_stripped)
        return self.rnd_gen.random().intRangeAtMost(i32, 10, 20);
        ^
???:?:?: 0x6c422e76504b451 in ??? (???)
Unwind information for `???:0x6c422e76504b451` was not available, trace may be incomplete

Aborted (core dumped)

Another strange thing - if in pub fn pick() I print self after the call to self.pickFn(), I get

iface_02b_stripped.Interface@0

But if I print it both before and after the call to self.pickFn(), I get

iface_02b_stripped.Interface@7ffc40479048
iface_02b_stripped.Interface@7ffc40479048

What gives?!

I think it is exactly that: the implementation of init is wrong. It takes the address of a local variable (the me instance) that isn't going to be valid after the function returns. When you do that, you get all kinds of strange behavior, and it can change vastly based on small tweaks that cause the compiler to generate different code.

You can even get code that at first appears to work correctly and then stops working when you change some other, unrelated code; basically, you enter the hell of using pointers that may or may not have been overwritten by other things.

I am not terribly interested in why the bug appears to go away when you do certain things; to me it seems you simply get lucky and avoid the segfault, but the code is still incorrect.

If you want a PickRandom instance that works properly, its .iface.impl needs to be set to a pointer that points to the instance itself, not to some temporary copy.
If you want to stack-allocate PickRandom, you have two options:

  1. Split the body of init into two steps, init and setup: init returns the instance with the interface left undefined; you then call setup, which can create the interface with the correct pointer (because by that point the address of the instance is established).
  2. Write a createPickRandom function that takes an allocator and uses it to create the PickRandom instance at a stable address; you could then use a FixedBufferAllocator to put it on the stack.
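Option 2 might look roughly like this - a sketch only, reusing the Interface and PickRandom definitions from the question; createPickRandom is a name invented for this example:

```zig
const std = @import("std");

const Interface = struct
{
    impl: *anyopaque,
    pickFn: *const fn (*anyopaque) i32,

    pub fn pick(self: *const Interface) i32
    {
        return self.pickFn(self.impl);
    }
};

const PickRandom = struct
{
    rnd_gen: std.rand.DefaultPrng,
    iface: Interface,

    fn myPick(self_void: *anyopaque) i32
    {
        const self = @as(*PickRandom, @ptrCast(@alignCast(self_void)));
        return self.rnd_gen.random().intRangeAtMost(i32, 10, 20);
    }
};

// Sketch of option 2: allocate PickRandom through an allocator so its
// address is stable before the interface captures a self-pointer.
fn createPickRandom(allocator: std.mem.Allocator) !*PickRandom
{
    const self = try allocator.create(PickRandom);
    self.* = .{
        .rnd_gen = std.rand.DefaultPrng.init(0),
        .iface = .{ .impl = self, .pickFn = &PickRandom.myPick },
    };
    return self;
}

pub fn main() !void
{
    // A FixedBufferAllocator keeps the backing storage on the stack.
    var buf: [256]u8 = undefined;
    var fba = std.heap.FixedBufferAllocator.init(&buf);
    const pick_random = try createPickRandom(fba.allocator());
    std.debug.print("picked {d}\n", .{pick_random.iface.pick()});
}
```

Because the interface is filled in only after the instance has its final address, the self-pointer never dangles.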

You already found the bug. The pointer is pointing at garbage. Whatever is in the garbage will get picked up in later calls. When you call std.debug.print, you are placing a pointer in the stack that points to an address you have permission to read, so the program reads it without noticing it’s garbage. Without the call, the garbage probably has some other value which is not a valid address, so the program crashes.
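The copy aspect can be seen even without any undefined behavior. Here is a minimal sketch (Holder is a made-up type for illustration) showing that copying a struct copies a stored pointer verbatim - it keeps pointing at the original:

```zig
const std = @import("std");

// Hypothetical minimal type: a struct that stores a pointer to itself.
const Holder = struct
{
    self_ptr: *Holder,
};

pub fn main() void
{
    var original: Holder = undefined;
    original.self_ptr = &original;

    // Copying the struct (which is what `return me;` does in init())
    // copies the pointer value as-is; it still points at `original`.
    var copy = original;
    std.debug.print("copy points to itself: {}\n", .{copy.self_ptr == &copy});
    std.debug.print("copy points to original: {}\n", .{copy.self_ptr == &original});
}
```

In init()'s case it is worse still: `original` lives in the stack frame that disappears when init returns, so the copied pointer also dangles.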

I’m not sure I understand this. The address in the pointer (address of me in PickRandom.init()'s stack frame) does not change, whether I pass it to a function (e.g. std.debug.print) or not. Obviously, PickRandom.init()'s stack frame doesn’t exist by the time Interface.pick is called.
So, probably, the same memory location is then re-used for std.debug.print()'s stack frame.

But on the other hand, by the time pickFn() gets called on this line:

const a = self.pickFn(self.impl);

std.debug.print() is done, so there’s no std.debug.print()'s stack frame either.

Even worse, my program does not crash if I pass self to std.debug.print() AFTER the call to self.pickFn()!

const a = self.pickFn(self.impl);
std.debug.print("{*}\n", .{ self });
init() - new PickRandom = iface_02b_stripped.PickRandom@7ffc6e0bff10, its iface.impl points to anyopaque@7ffc6e0bff10
iface_02b_stripped.Interface@6e789e6aa1b965f4
ifaceUserFn() call #1 returned 19
iface_02b_stripped.Interface@0
ifaceUserFn() call #2 returned 19
iface_02b_stripped.Interface@0
ifaceUserFn() call #3 returned 19

So I guess this behavior will remain a mystery, at least for now.

I have changed my code to this (relevant snippets only):

const PickRandom = struct
{
    rnd_gen: std.rand.DefaultPrng,
    iface: Interface,

    fn init() PickRandom
    {
        return
        .{
            .rnd_gen = std.rand.DefaultPrng.init(0),
            .iface = undefined,
        };
    }

    fn initInterface(self: *PickRandom) void
    {
        self.iface.impl = self;
        self.iface.pickFn = &PickRandom.myPick;
    }
    // ...
};

And in main():

pub fn main() void
{
    var pick_random = PickRandom.init();
    pick_random.initInterface();
    ifaceUserFn(pick_random.iface);
}

However, I don’t think I like this approach, as you can no longer copy pick_random, pass it by value to a function, etc. - the stored pointer would go stale.

PickRandom.interface() found in the original post is much safer, if slightly slower (or is it?): you have to call a function, and it creates a structure on the stack every time and returns it to the caller.

Sure, that is another way of doing it; it is also what the standard library does with, for example, ArrayList.writer.

My suggestion was for when you really want to store the interface in the struct, as your example showed. Personally, I would go for not storing it: simply return the interface from a separate method, so that you have the choice at the call site to keep reusing the returned instance.

It also doesn’t necessarily make sense to put an interface that is meant for generic/dynamic access into a structure, which then again requires you to know how to access that interface from the structure through duck typing or whatever.

So it makes sense to keep the interface separate from the concrete type and just give the concrete type a method to get the interface, so that you don’t need to know the concrete type from that point onward.
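The on-demand approach from the original post can be sketched like this (same Interface shape as in the question; since nothing is stored in PickRandom, nothing can go stale):

```zig
const std = @import("std");

const Interface = struct
{
    impl: *anyopaque,
    pickFn: *const fn (*anyopaque) i32,

    pub fn pick(self: *const Interface) i32
    {
        return self.pickFn(self.impl);
    }
};

const PickRandom = struct
{
    rnd_gen: std.rand.DefaultPrng,

    fn init() PickRandom
    {
        return .{ .rnd_gen = std.rand.DefaultPrng.init(0) };
    }

    // Build the interface on demand: the self pointer received here is
    // valid at the call site, so the stored pointer cannot dangle.
    fn interface(self: *PickRandom) Interface
    {
        return .{ .impl = self, .pickFn = &PickRandom.myPick };
    }

    fn myPick(self_void: *anyopaque) i32
    {
        const self = @as(*PickRandom, @ptrCast(@alignCast(self_void)));
        return self.rnd_gen.random().intRangeAtMost(i32, 10, 20);
    }
};

pub fn main() void
{
    var pick_random = PickRandom.init();
    const iface = pick_random.interface();
    std.debug.print("picked {d}\n", .{iface.pick()});
}
```

The caller can still cache the returned Interface for as long as pick_random itself stays put; the decision is just made at the call site instead of inside init().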

If you don’t know, why are you trying to cache it? You should only start caching things when you know it will bring a benefit (avoiding needless work, etc.). Don’t overthink it; when you really think something is a problem but aren’t sure, find a way to measure.

I think micro-benchmarks aren’t terribly helpful; it is better to have two variants of a mid-complexity program and compare them and what they are doing.


No, I actually don’t subscribe to “performance above all” idea.
Other things, like how easy or hard is it to introduce errors unwittingly, matter as well.
Which is why, like I said, I’d only use interface caching if the performance difference were big. And, like you said, you can’t tell before you measure - I was just playing with the concept.
