I’m learning Zig, coming from higher-level languages (Go, mostly). I’m trying to get a clear understanding of stack allocations and how their lifetime is tied to their stack frame. I think I get the basic principle: if I declare a var inside a function, the value is allocated on the stack, in the frame of the current function. When the function exits, its stack frame is discarded, along with all memory attached to it.
Now, here is an example:
const std = @import("std");

pub fn main() void {
    var val: u64 = 23;
    // val is allocated in main stackframe
    std.debug.print("{} @ {}\n", .{ val, &val });
    // prints 23 @ u64@7ffc5047be10
    const b = Box.init(val);
    // passing val to init (by address or value, zig compiler is making the choice)
    std.debug.print("{} @ {}\n", .{ b.value.*, b.value });
    // prints 23 @ u64@7ffc5047bdf8
    // I thought here I would get u64@7ffc5047bde0, and some random value because of the dangling pointer, but apparently not?
}

const Box = struct {
    value: *const u64,

    fn init(v: u64) Box {
        std.debug.print("{} @ {}\n", .{ v, &v });
        // prints 23 @ u64@7ffc5047bde0 not the same address as val
        // v is allocated inside in init stackframe?
        return Box{
            .value = &v, // storing the address of a stack allocated value, should result in a dangling ptr?
        };
    }
};
So obviously there is something I don’t understand … can anyone explain why b.value does not point to an invalid address? What am I missing here?
In this line of code, the parameter v is a copy of the argument you passed in:
fn init(v: u64) Box { ...
That copy’s memory is only reserved while the function call is still active. Once you return from the function, that memory can be repurposed. So by taking a pointer to it with .value = &v, you’re now referencing memory that will be repurposed.
So yes, that will be a dangling pointer. Here’s the thing though - the address may still be valid, but what lives at that address has no guarantees anymore.
Also keep in mind the difference between the two things you’re printing. Printing b.value prints the address the pointer contains, not the value being pointed to. To get the value the pointer is actually pointing to, you need to dereference it: b.value.* (the .* operator dereferences the pointer). Only that dereference touches the memory that may have been repurposed.
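For what it’s worth, here is a minimal sketch of two ways to avoid the dangling pointer. The names ValueBox and RefBox are made up for illustration and are not from your code: the first stores the value itself, the second takes a pointer whose target is owned by the caller, so it outlives the call to init.

```zig
const std = @import("std");

// Hypothetical variant 1: the box owns a copy of the value, so nothing can dangle.
const ValueBox = struct {
    value: u64,

    fn init(v: u64) ValueBox {
        return .{ .value = v };
    }
};

// Hypothetical variant 2: the box borrows a pointer supplied by the caller.
// The pointee belongs to the caller, not to init, so it survives init returning.
const RefBox = struct {
    value: *const u64,

    fn init(v: *const u64) RefBox {
        return .{ .value = v };
    }
};

pub fn main() void {
    const val: u64 = 23; // owned by main, so it outlives both boxes

    const owned = ValueBox.init(val);
    const borrowed = RefBox.init(&val);

    // prints: 23 23
    std.debug.print("{d} {d}\n", .{ owned.value, borrowed.value.* });
}
```

With RefBox the caller still has to make sure val stays alive for as long as the box is used; the struct itself does nothing to enforce that.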
I’m going to add an addendum here because I think it may be helpful. Remember, a pointer is fundamentally just an integer: it keeps track of a numerical address. So let’s write some pseudocode…
val: int = 5; // has value 5 and assume it lives at address 12345
ptr: int = 12345; // has value 12345, assume it lives at 123XX
The variable ptr is an integer that just so happens to contain a number equal to the address of the variable val. No matter what happens to val, ptr will still hold that number until we change it. We can print that number just fine. The problem arises when we try to read the memory at that address (the dereference operation). That memory may have been repurposed, or may now even lie outside the memory that the operating system has assigned to our program (in which case we get a segfault).
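The same idea as the pseudocode, as a small Zig sketch. It assumes Zig 0.11 or newer, where the builtin is called @intFromPtr (older releases named it @ptrToInt). It shows that the number held by the pointer stays the same even when the value it points to changes.

```zig
const std = @import("std");

pub fn main() void {
    var val: u64 = 5;
    const ptr: *u64 = &val;

    // The pointer viewed as a plain integer (the address of val).
    const before: usize = @intFromPtr(ptr);

    // Change the pointee through the pointer; the pointer itself is untouched.
    ptr.* = 6;
    const after: usize = @intFromPtr(ptr);

    std.debug.print("val = {d}\n", .{val}); // prints: val = 6
    std.debug.print("address unchanged: {}\n", .{before == after}); // prints: address unchanged: true
}
```

Here val is still alive, so dereferencing is fine. The dangling case differs only in that the memory behind the number no longer belongs to the variable.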
So, I got some of it right: v is discarded at the end of init, so &v is a dangling pointer. But some things are still mysterious.
Here is another example:
const std = @import("std");

const Box = struct {
    value: *const u64,

    fn one(v: u64) Box {
        std.debug.print("address of parameter v {}\n", .{&v});
        return .{
            .value = &v,
        };
    }

    fn two(v: u64) Box {
        var tmp = v;
        std.debug.print("address of var tmp {}\n", .{&tmp});
        return .{
            .value = &tmp,
        };
    }
};

pub fn main() void {
    const one = Box.one(23);
    std.debug.print("box one {}\n\n", .{one});
    const two = Box.two(23);
    std.debug.print("box two {}\n", .{two});
}
resulting in:
> zig run mem.zig
address of parameter v u64@7ffe80c80188
box one mem.Box{ .value = u64@7ffe80c80198 }
address of var tmp u64@7ffe80c80190
box two mem.Box{ .value = u64@7ffe80c80190 }
So here, when I’m using the address of the parameter directly inside of the Box, it gets changed. Whereas if I copy the value to tmp and use that address, it stays the same.
So I guess the compiler is doing something clever when using the address of a parameter in a function?
I played around with the example on godbolt, and I’m starting to think that RLS (result location semantics) is coming into play here. What version of Zig are you on? I’ll play around with the example more later because that’s actually an interesting little difference you found there.
An integer argument like this is generally going to be passed in a register, so to take its “address” the compiler actually needs to copy it to somewhere on the stack. It seems that each time an address is required, a new copy is being put on the stack.
That would certainly make sense in this case. It also explains why the two assembly code snippets are the same because it would automatically promote it to the same address the OP is seeing. Just another reason not to take pointers to variables outside of their intended scope.