Allocating memory in a loop

n11n · December 15, 2024, 5:27am

Hi there,

I’m new to Zig (coming from Python/Go) and I am experiencing some perplexing behaviour, which I assume is based on a misunderstanding somewhere on my part. This is some code which highlights the behaviour:

const std = @import("std");

fn testing(allocator: std.mem.Allocator) !void {
    const m1 = try allocator.alloc(u8, 100);
    defer allocator.free(m1);
    std.debug.print("m1: {d}\n", .{@intFromPtr(&m1)});

    const m2 = try allocator.alloc(u8, 100);
    defer allocator.free(m2);
    std.debug.print("m2: {d}\n", .{@intFromPtr(&m2)});

    const m3 = try allocator.alloc(u8, 100);
    defer allocator.free(m3);
    std.debug.print("m3: {d}\n", .{@intFromPtr(&m3)});

    std.debug.print("\n", .{});

    for (0..5) |i| {
        var memory = try allocator.alloc(u8, 100);
        defer allocator.free(memory);

        std.debug.print("p{d}: {d}\n", .{ i, @intFromPtr(&memory) });
    }
}

test testing {
    const allocator = std.testing.allocator;
    try testing(allocator);
}

Outputs:

$ zig test src/main.zig
m1: 140726576320016
m2: 140726576320112
m3: 140726576320224

p0: 140726576320280
p1: 140726576320280
p2: 140726576320280
p3: 140726576320280
p4: 140726576320280
All 1 tests passed.

The first 3 lines printed show m1, m2 and m3 all having different pointers.
The last 5 lines, however, all show the same pointer being used, which to me is unexpected. I would have thought I would see 5 more distinct pointers, one for each of the pN lines.

The actual use case which started all this is me creating a StringHashMap whose values are ArrayLists. The intention is to go through a set of input values which are pairs: if the key is present in the hashmap, append to its list, otherwise create a new ArrayList with the single value in it (essentially a defaultdict(list) in Python3 FWIW). The behaviour outlined above was causing this implementation to behave in the same way.

I have been stuck on this for days now and feel as though I have a misunderstanding at some fundamental level, or there is some behaviour within Zig that I am unaware of at play here.

Thanks very much in advance for any assistance

kivikakk · December 15, 2024, 6:14am

There are two things here.

The key thing here is that defer runs at the end of the enclosing block, not necessarily the enclosing function. If we unroll your loop, it’ll look something like this:

var memory0 = try allocator.alloc(u8, 100);
std.debug.print("p{d}: {d}\n", .{ 0, @intFromPtr(&memory0) });
allocator.free(memory0);

var memory1 = try allocator.alloc(u8, 100);
std.debug.print("p{d}: {d}\n", .{ 1, @intFromPtr(&memory1) });
allocator.free(memory1);

var memory2 = try allocator.alloc(u8, 100);
std.debug.print("p{d}: {d}\n", .{ 2, @intFromPtr(&memory2) });
allocator.free(memory2);

// etc.

If it could work the way you initially expected, it’d imply that each defer in the loop actually appended to some runtime-resizable array of deferred executions, since there’s no way to know at compile-time how many such defers might get reached before the end of the function. In reality, they get executed whenever the block is left, which is easily done at compile time (since looping constructs take care of themselves, as shown above).
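To make the ordering concrete, here is a minimal sketch that records when each statement runs. Trace and run are hypothetical names made up for this illustration; only defer, for and std.testing come from Zig itself.

```zig
const std = @import("std");

/// Hypothetical helper for illustration: a tiny fixed-size log of single-byte marks.
const Trace = struct {
    buf: [16]u8 = undefined,
    len: usize = 0,

    fn mark(self: *Trace, c: u8) void {
        self.buf[self.len] = c;
        self.len += 1;
    }

    fn slice(self: *const Trace) []const u8 {
        return self.buf[0..self.len];
    }
};

fn run(trace: *Trace) void {
    // Deferred in the function's outermost block: runs when the function returns.
    defer trace.mark('F');

    for (0..3) |_| {
        // Deferred in the loop body: runs at the end of every iteration.
        defer trace.mark('d');
        trace.mark('b');
    }

    trace.mark('e');
}

test "defer runs when its enclosing block exits" {
    var trace = Trace{};
    run(&trace);
    // b/d alternate per iteration; the function-level defer comes last.
    try std.testing.expectEqualStrings("bdbdbdeF", trace.slice());
}
```

Each 'd' lands right after its own 'b', rather than all three piling up at the end next to 'F'. That is the same reason the free in the original loop runs before the next iteration allocates.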

The other thing is that by using &m1, &m2 etc. in your print statements, you’re printing the address of the slice variable on the stack, not the address of the heap memory it points to. m1 as returned by allocator.alloc is in fact a []u8, which is a slice: a pointer and a length. &m1, then, is the address of that slice as stored on the stack. Even if we fix the allocations themselves, you’ll still get the same output, because in the loop memory will almost certainly occupy the same slot in the function’s stack frame.

(Notice how your m1 and m2 in your debug output are less than 100 bytes apart, despite m1’s allocation being 100 bytes long!)
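A small sketch of the difference, assuming a 100-byte allocation like the one in the question. startAddress is a hypothetical helper; the size check reflects how current Zig compilers lay out slices rather than a documented guarantee.

```zig
const std = @import("std");

// Hypothetical helper: the address of the first byte a slice refers to.
fn startAddress(bytes: []const u8) usize {
    return @intFromPtr(bytes.ptr);
}

test "address of the slice vs. address of the bytes" {
    const allocator = std.testing.allocator;

    const memory = try allocator.alloc(u8, 100);
    defer allocator.free(memory);

    // &memory is where the slice value (pointer + length) lives: a local variable.
    const slice_addr = @intFromPtr(&memory);
    // memory.ptr is where the 100 allocated bytes live: inside the allocator's memory.
    const heap_addr = startAddress(memory);

    try std.testing.expect(slice_addr != heap_addr);
    try std.testing.expectEqual(@as(usize, 100), memory.len);
    // In practice a slice is two machine words: a pointer and a length.
    try std.testing.expectEqual(2 * @sizeOf(usize), @sizeOf([]u8));
}
```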

To have your example clean up all 5 after the loop, and to show the actual allocated addresses on the heap (and not the locations of the pointers themselves on the stack), you’ll need something like this:

const std = @import("std");

fn testing(allocator: std.mem.Allocator) !void {
    const m1 = try allocator.alloc(u8, 100);
    defer allocator.free(m1);
    std.debug.print("m1: {*}\n", .{m1});

    const m2 = try allocator.alloc(u8, 100);
    defer allocator.free(m2);
    std.debug.print("m2: {*}\n", .{m2});

    const m3 = try allocator.alloc(u8, 100);
    defer allocator.free(m3);
    std.debug.print("m3: {*}\n", .{m3});

    std.debug.print("\n", .{});

    var memories = std.ArrayListUnmanaged([]u8){};
    defer {
        for (memories.items) |m|
            allocator.free(m);
        memories.deinit(allocator);
    }

    for (0..5) |i| {
        const memory = try allocator.alloc(u8, 100);
        try memories.append(allocator, memory);

        std.debug.print("p{d}: {*}\n", .{ i, memory });
    }
}

test testing {
    const allocator = std.testing.allocator;
    try testing(allocator);
}

Output:

m1: u8@1027d4000
m2: u8@1027d4080
m3: u8@1027d4100

p0: u8@1027d4180
p1: u8@1027d4280
p2: u8@1027d4300
p3: u8@1027d4380
p4: u8@1027d4400
OK
All 1 tests passed.

We store each allocated slice in an ArrayList so that we can free them all on function exit.

mnemnion · December 15, 2024, 6:49pm

This is an important point to emphasize if someone has Go experience, because it doesn’t work that way in Go at all.

defer and errdefer are scoped to a block; they have nothing to do with functions at all, except in the sense that a function has a main block, so a defer statement in that block runs when the function exits.

Pointers are just a memory location, they aren’t a substitute for object identity (which is not a thing in Zig). Your code is reusing pointers, because you’re allocating inside the for loop block, and freeing at the end of it, and the allocator happens to reuse the memory. That is not something you can rely on!
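A rough sketch of what can and cannot be relied on here. sameStart is a hypothetical helper; whether the last comparison prints true or false depends entirely on the allocator, which is why it is printed and not asserted.

```zig
const std = @import("std");

// Hypothetical helper: true when two slices start at the same address.
fn sameStart(a: []const u8, b: []const u8) bool {
    return a.ptr == b.ptr;
}

test "only live allocations are guaranteed distinct" {
    const allocator = std.testing.allocator;

    // Two allocations that are alive at the same time cannot overlap.
    const a = try allocator.alloc(u8, 100);
    defer allocator.free(a);
    const b = try allocator.alloc(u8, 100);
    defer allocator.free(b);
    try std.testing.expect(!sameStart(a, b));

    // Once memory is freed, the allocator may hand the same address out again.
    const first = try allocator.alloc(u8, 100);
    const first_addr = @intFromPtr(first.ptr);
    allocator.free(first);

    const second = try allocator.alloc(u8, 100);
    defer allocator.free(second);
    const reused = first_addr == @intFromPtr(second.ptr);
    std.debug.print("address reused: {}\n", .{reused});
}
```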

With some exceptions, a pointer is not a good choice for a HashMap key. It’s not perfectly clear to me whether that’s what your actual code is doing, or what it is that you need it to do.

Given your background, you’re used to languages which have object identity, so let’s start there: Zig does not have object identity, and the integer value of pointers is not a substitute for it.

Welcome to Ziggit @n11n

n11n · December 18, 2024, 6:57am

Thank you both for taking the time to respond! Both of you gave some very valuable nuggets of information which have helped me figure this out (after a few more days still). Ultimately, I think my example didn’t do a great job of describing the problem I was really having, apologies there.

@mnemnion ultimately your comment about memory not being a substitute for object identity was at the core of my issue. My code is using a StringHashMap, so the keys are strings, not pointers, but the values were previously intended as a ‘pointer to an ArrayList’, which is not quite the right way to go about things, and hence I was getting behaviour similar to that described above.
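For anyone landing here with the same defaultdict(list) use case, this is a minimal sketch of one way to do it, storing each list by value in the map and not a pointer to a loop-local list. It is not the code from my project: Pair, addPair and freeGroups are made-up names, the u32 values are an assumption, and it uses the same ArrayListUnmanaged style as the answer above, so details may differ in other Zig versions.

```zig
const std = @import("std");

const List = std.ArrayListUnmanaged(u32);
const Groups = std.StringHashMap(List);

// Hypothetical input shape for illustration.
const Pair = struct { key: []const u8, value: u32 };

fn addPair(groups: *Groups, allocator: std.mem.Allocator, pair: Pair) !void {
    const entry = try groups.getOrPut(pair.key);
    // First time this key is seen: initialise an empty list in place, inside the map.
    if (!entry.found_existing) entry.value_ptr.* = .{};
    try entry.value_ptr.append(allocator, pair.value);
}

fn freeGroups(groups: *Groups, allocator: std.mem.Allocator) void {
    // Free each list's buffer first, then the map's own storage.
    var it = groups.valueIterator();
    while (it.next()) |list| list.deinit(allocator);
    groups.deinit();
}

test "group values by key" {
    const allocator = std.testing.allocator;

    var groups = Groups.init(allocator);
    defer freeGroups(&groups, allocator);

    // Keys are string literals here, so they outlive the map;
    // StringHashMap does not copy key bytes.
    const pairs = [_]Pair{
        .{ .key = "a", .value = 1 },
        .{ .key = "b", .value = 2 },
        .{ .key = "a", .value = 3 },
    };
    for (pairs) |pair| try addPair(&groups, allocator, pair);

    try std.testing.expectEqualSlices(u32, &[_]u32{ 1, 3 }, groups.get("a").?.items);
    try std.testing.expectEqualSlices(u32, &[_]u32{2}, groups.get("b").?.items);
}
```

The list lives inside the map entry, so there is nothing allocated per loop iteration that a defer could free too early. The map and all of its lists are released together by the single defer at the end.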

Thanks again, really appreciate the help (and the welcome)!