Taking a pointer to a temporary variable leaves that pointer dangling once the temporary goes out of scope.
This subject is closely related to object lifetimes; see "Documentation - The Zig Programming Language".
Example 1: Temporary variable within function scope
fn foo() *const usize {
    // reserve memory on the stack for a usize called bar
    var bar: usize = 42;
    // once we exit this function, bar is out of scope,
    // invalidating the address of the returned pointer.
    return &bar;
}
Example 2: Temporary value from function parameter
fn foo(bar: usize) *const usize {
    // note that bar is passed by value, not pointer. This means
    // that bar will exist only within the function scope.
    // once we exit this function, bar is out of scope,
    // invalidating the address of the returned pointer.
    return &bar;
}
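For contrast, here is a minimal sketch of two ways to hand back a pointer that stays valid after the function returns. The helper names `fillBar` and `createBar` are hypothetical and not part of the original examples: the first writes into storage owned by the caller, the second places the value on the heap and leaves the caller responsible for freeing it.

```zig
const std = @import("std");

// Hypothetical helper: the caller owns the storage, so the returned
// pointer is valid for as long as the caller's variable is alive.
fn fillBar(out: *usize) *const usize {
    out.* = 42;
    return out;
}

// Hypothetical helper: the value lives on the heap until the caller
// destroys it with the same allocator.
fn createBar(allocator: std.mem.Allocator) !*usize {
    const bar = try allocator.create(usize);
    bar.* = 42;
    return bar;
}

pub fn main() !void {
    // storage lives in main's stack frame, which outlives fillBar.
    var storage: usize = 0;
    const p = fillBar(&storage);

    const allocator = std.heap.page_allocator;
    const q = try createBar(allocator);
    defer allocator.destroy(q);

    std.debug.print("{d} {d}\n", .{ p.*, q.* });
}
```

In both cases the memory behind the pointer belongs to something that outlives the callee, which is the property the two `foo` functions above lack.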
Example 3: Invalid temporary from init function
const std = @import("std");
// let's imagine that a user wants a type that contains its own
// memory arena. From there, they want to assign an allocator
// to its internal arena to be used with other data structures
const Foo = struct {
    const Self = @This();
    arena: std.heap.ArenaAllocator,
    arena_allocator: std.mem.Allocator,
    // note that "Self" here means we are returning a new value
    pub fn init(backing_allocator: std.mem.Allocator) Self {
        // creating temporary arena with backing allocator
        var tmp_arena = std.heap.ArenaAllocator.init(backing_allocator);
        // this example is incorrect: .arena receives a copy of tmp_arena
        // at a different address, while allocator() captures a pointer
        // to tmp_arena itself, which is gone once init returns.
        return Self{
            .arena = tmp_arena,
            .arena_allocator = tmp_arena.allocator(),
        };
    }
};
One might conclude that the problem is the tmp_arena variable and try to fix it by assigning from one member variable to another. This is still incorrect.
const std = @import("std");
const Foo = struct {
    const Self = @This();
    arena: std.heap.ArenaAllocator,
    arena_allocator: std.mem.Allocator,
    // note that "Self" here means we are returning a new value
    pub fn init(backing_allocator: std.mem.Allocator) Self {
        // In this example, the user is trying to connect the member
        // variables to themselves to avoid the initial temporary arena.
        var self = Self{
            .arena = std.heap.ArenaAllocator.init(backing_allocator),
            .arena_allocator = undefined,
        };
        // here the user tries to assign from another member variable
        self.arena_allocator = self.arena.allocator();
        // the self variable is still a temporary. Returning it copies the
        // struct to the caller, but arena_allocator still points at the
        // arena inside this function's stack frame, which is invalid
        // memory once we exit this function.
        return self;
    }
};
Instead, here is one way to approach this that will leave everything in a valid state.
const std = @import("std");
const Foo = struct {
    const Self = @This();
    // remove the arena member variable and keep only an allocator
    allocator: std.mem.Allocator,
    pub fn init(allocator: std.mem.Allocator) Self {
        // copy the caller's allocator (a pointer plus a vtable)
        return Self{ .allocator = allocator };
    }
};
// later...
// create an arena in the scope where it will be used
var arena = std.heap.ArenaAllocator.init(backing_allocator);
// pass the arena's allocator, which points at `arena`, into Foo's init function.
var foo = Foo.init(arena.allocator());
Here’s another approach where the init function takes a pointer to an instance of the struct to be initialized. This is an idiom seen often in C.
const std = @import("std");
const Foo = struct {
    const Self = @This();
    arena: std.heap.ArenaAllocator = undefined,
    arena_allocator: std.mem.Allocator = undefined,
    // We pass in a pointer to a mutable Foo. It could be
    // on a stack frame higher up the call stack or on the
    // heap.
    pub fn init(
        backing_allocator: std.mem.Allocator,
        self: *Self,
    ) void {
        // No problems here given that everything is placed
        // in memory that outlives this function's scope.
        self.arena = std.heap.ArenaAllocator.init(backing_allocator);
        self.arena_allocator = self.arena.allocator();
        // You can quickly confirm a pointer's address by
        // using the `{*}` format specifier.
        std.debug.print("init: {*} {*} {*}\n", .{
            self, // already a pointer
            &self.arena,
            &self.arena_allocator,
        });
    }
    pub fn deinit(self: *Self) void {
        // Free any memory allocated by the arena.
        self.arena.deinit();
    }
};
pub fn main() !void {
    // Let's use our old friend the GPA.
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = gpa.deinit();
    const allocator = gpa.allocator();
    // Here Foo is instantiated in main's stack.
    var foo = Foo{};
    defer foo.deinit();
    // Even if Foo is in main's temporary memory, this call
    // is OK because there's no way Foo can become invalid
    // before init returns. Foo "outlives" the call to
    // init, or in other words, its lifetime is longer.
    Foo.init(allocator, &foo);
    // Let's confirm the addresses are the same.
    std.debug.print("main: {*} {*} {*}\n", .{
        &foo,
        &foo.arena,
        &foo.arena_allocator,
    });
}
In a sample run, this produces the output (note output will differ between runs and machines):
init: main.Foo@16b69f030 heap.arena_allocator.ArenaAllocator@16b69f030 mem.Allocator@16b69f050
main: main.Foo@16b69f030 heap.arena_allocator.ArenaAllocator@16b69f030 mem.Allocator@16b69f050
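A third option, sketched here as an assumption rather than as part of the original examples, is to avoid storing the allocator at all. The struct keeps only the arena, and a hypothetical `allocator` method builds the `std.mem.Allocator` handle on demand from a pointer to the instance at its final address. Returning the freshly initialized arena by value from `init` mirrors what `std.heap.ArenaAllocator.init` itself does.

```zig
const std = @import("std");

const Foo = struct {
    const Self = @This();
    arena: std.heap.ArenaAllocator,

    pub fn init(backing_allocator: std.mem.Allocator) Self {
        // No pointer into this stack frame is stored anywhere.
        return Self{ .arena = std.heap.ArenaAllocator.init(backing_allocator) };
    }

    // Hypothetical accessor: self already points at the instance's
    // final location, so the handle refers to the right arena.
    pub fn allocator(self: *Self) std.mem.Allocator {
        return self.arena.allocator();
    }

    pub fn deinit(self: *Self) void {
        // Free any memory allocated by the arena.
        self.arena.deinit();
    }
};

pub fn main() !void {
    var foo = Foo.init(std.heap.page_allocator);
    defer foo.deinit();

    const a = foo.allocator();
    const buf = try a.alloc(u8, 16);
    std.debug.print("allocated {d} bytes\n", .{buf.len});
}
```

The handle returned by `foo.allocator()` points at `foo.arena`, so it is only valid while that `foo` stays alive at the same address; copying or moving `foo` while the handle is in use would reintroduce the original problem.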
Example 4: Slicing a copy of an array on stack
Freely adapted from this topic.
const std = @import("std");
const log = std.debug.print;
const ToyStr = struct {
    const CAP: usize = 9;
    buf: [CAP]u8 = undefined,
    // note that self is passed by value
    fn sliceMeNice(self: ToyStr, from: usize, to: usize) []const u8 {
        log("inside: {s}\n", .{self.buf[from..to]});
        return self.buf[from..to];
    }
};
pub fn main() !void {
    var ts = ToyStr{};
    @memcpy(ts.buf[0..], "aaabbbccc");
    const s = ts.sliceMeNice(0, 6);
    log("outside: {s}\n", .{s});
}
Against expectation, this program does not output “aaabbb” in the main() function:
$ ./toy-str-fg
inside: aaabbb
outside: bb
We are passing an instance of ToyStr by value, so sliceMeNice works with a copy.
It makes a slice of that copy, but once the function returns, its stack frame is reused and the stack contents change.
The slice still points to the same place on the stack, but there is no longer a valid copy at that place.
To make the program work correctly, just pass the instance of ToyStr by reference:
fn sliceMeNice(self: *ToyStr, from: usize, to: usize) []const u8 {
                     ^
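Put together, a minimal sketch of the corrected program could look like the following. It keeps the names from the example above and changes only the receiver type; the `inside` log line is dropped for brevity.

```zig
const std = @import("std");

const ToyStr = struct {
    const CAP: usize = 9;
    buf: [CAP]u8 = undefined,

    // self is now a pointer, so the slice refers to the caller's
    // buffer rather than to a copy in this function's stack frame.
    fn sliceMeNice(self: *ToyStr, from: usize, to: usize) []const u8 {
        return self.buf[from..to];
    }
};

pub fn main() void {
    var ts = ToyStr{};
    @memcpy(ts.buf[0..], "aaabbbccc");
    const s = ts.sliceMeNice(0, 6);
    std.debug.print("outside: {s}\n", .{s});
}
```

Because the method only reads from `buf`, declaring the receiver as `*const ToyStr` would work as well. The returned slice is valid only for as long as `ts` itself is alive.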