How to Free Memory that was used to create a struct or tagged union from a function

Clinton-Nmereole · December 10, 2023, 11:46pm

If you had a struct user that was composed of a name and an email, and you had a function makeuser(user_name,email, allocator). This function returns a user but with duplicates of user_name and email.

const User = struct {
    name: [] const u8,
    email: [] const u8,
}

pub fn makeUser( user_name, email, allocator) User {
    return .{
        .name = try allocator.dupe(u8, user_name),
        .email = try allocator.dupe(u8, email)
    }
}

And I had another struct say UserGroup that had an arraylist of users as a field but needed to get rid of them after performing an operation, how would I make sure that the name and email field of each user is freed.

const UserGroup = struct {
    group: std.Arraylist(User),
    allocator: std.mem.allocator,
   
    pub fn readUser(self: *UserGroup, name, email) void {
        const new_user = makeUser(name,email, self.allocator);
        try self.group.append(new_user);
    }
   
   pub fn freeGroup(self: *UserGroup) void {
    //Here I want to delete everything in the group, so that the Arraylist is empty but I know that the users created with makeUser since they are using dupe allocated memory on the heap for those values. How do I from this function free that memory as well to prevent memory leaks.
   }
    
}

I don’t know if this is possible or if what I’m trying to do is not idiomatic Zig. If the latter, what would be the correct way to go about it.

kristoff · December 11, 2023, 12:50am

for (group.items) |user| {
   allocator.free(user.name);
   allocator.free(user.email);
}
group.deinit();

Memory allocation strategies aren’t really a thing that can be idiomatic or not. Any given program will have a preferred strategy and that’s something specific to the problem it’s designed to solve.

If you want to learn more about different memory management techniques, take a look at arenas (std.heap.ArenaAllocator in the stdlib).

Clinton-Nmereole · December 11, 2023, 4:13am

Would this also work with a StringHashMap instead of an ArrayList. If group was a StringHashMap?
So,

var iter = self.group.iterator()
while (iter.next()) |kv| {
    allocator.free(kv.value_ptr.*.name);
}

I tried it on a tagged union and got an error

/home/clinton/zig-linux-x86_64-0.12.0-dev.80+014d88ef6/lib/std/mem/Allocator.zig:307:45: error: access of union field 'Pointer' while field 'Union' is 
active
    const Slice = @typeInfo(@TypeOf(memory)).Pointer;

dude_the_builder · December 11, 2023, 12:20pm

What I’ve found to be a useful convention in these types of cases is to let the struct itself manage its own cleanup in a deinit method. So

const User = struct {
    allocator: std.mem.Allocator, // save this for deinit
    ...
    fn deinit(self: *User) void {
        self.allocator.free(self.name);
        self.allocator.free(self.email);
    }
}

Now whenever it’s time to cleanup a user, just call its deinit method.

ktz_alias · December 11, 2023, 1:46pm

Recently, I’ve used std.json.Parsed.
For specified example,

const User = struct {
    name: [] const u8,
    email: [] const u8,
};

pub fn makeUser( user_name: []const u8, email: []const u8, allocator: std.mem.Allocator) !std.json.Parsed(User) {
    var arena = try allocator.create(std.heap.ArenaAllocator);
    arena.* = std.heap.ArenaAllocator.init(allocator);
    
    var managed_allocator = arena.allocator();

    return .{
        .arena = arena,
        .value = .{
            .name = try managed_allocator.dupe(u8, user_name),
            .email = try managed_allocator.dupe(u8, email)
        }
    };
}

pub fn main() !void {
    const user_profile = try makeUser("foo", "foo@example.com", std.heap.page_allocator);
    defer user_profile.deinit();
}

Clinton-Nmereole · December 11, 2023, 3:36pm

I have run a few test with this with both Hashmaps and ArrayLists and specifically using Hashmaps, there are instances of when it returns the error

thread 821459 panic: Invalid free
/home/clinton/zig-linux-x86_64-0.12.0-dev.80+014d88ef6/lib/std/heap/general_purpose_allocator.zig:609:21: 0x23419e in freeLarge (main)
                    @panic("Invalid free");
                    ^

Edit: Allocator used was GPA

Clinton-Nmereole · December 11, 2023, 3:42pm

This might be the solution, I’ll run a couple of tests and let you know.

Clinton-Nmereole · December 11, 2023, 4:14pm

This doesn’t work with Tagged Unions, but is excellent for structs. My use case leans more towards tagged unions though.

chung-leong · December 11, 2023, 7:45pm

I don’t think that’s quite right. The responsibility of freeing memory lies with the owner of the allocator. The struct shouldn’t free itself.

Sze · December 12, 2023, 6:21pm

I think a deinit function like this makes sense in combination with the makeUser function, you have one way to create them and one way to deinit them. When you program needs to create users in multiple different ways, then I would agree with you that at that point its better to pass that responsibility on like you said.

However I would only store the allocator for big manager like structs and for smaller things I would pass the allocator to the deinit function.

Validark · December 12, 2023, 9:21pm

In my opinion, objects that are created at the same time, together, which should be freed at the same time, should probably share the same allocation unless you are optimizing your memory access pattern for, e.g. just iterating over names. I wrote this on my phone but this is how I’d do what you’re trying to do.

const User = struct {
    name_start: u32,
    email_start: u32,
    email_len: u32,

   fn name(self: @This(), memory: []const u8) []const u8 {
       return memory[self.name_start..][0..self.email_start];
   }

   fn email(self: @This(), memory: []const u8) []const u8 {
       return memory[self.email_start..self.email_len];
   }
}

const UserGroup = struct {
    group: std.ArraylistUnmanaged(User),
    str_buffer: std.ArrayListUnmanaged(u8),
    allocator: std.mem.allocator,
   
    pub fn readUser(self: *UserGroup, name: []const u8, email: []const u8) !void {
        errdefer self.deinit();
        var user: User = undefined;
        user.name_start = std.math.cast(u32, self.str_buffer.items.len) orelse return error.OutOfMemory;
        try self.str_buffer.appendSlice(self.allocator, name);
        user.email_start = std.math.cast(u32, self.str_buffer.items.len) orelse return error.OutOfMemory;
        try self.str_buffer.appendSlice(self.allocator, email);
        user.email_len = std.math.cast(u32, email.len) orelse return error.OutOfMemory;
        try self.group.append(self.allocator, user);
    }
   
   pub fn deinit(self: *UserGroup) void {
       self.str_buffer.deinit();
       self.group.deinit();
   }
}

chung-leong · December 12, 2023, 9:50pm

I was objecting to keeping a reference to the allocator in the struct, thereby allowing memory to be freed without intervention from the owner of the allocator. That’s very dodgy. There no guarantee the allocator still exists even. Passing the allocator again is the way to go.

jmc · December 12, 2023, 11:53pm

Passing the allocator again doesn’t achieve much but expose the user of the interface to the possibility that a different allocator is passed instead by mistake.

I’m not sure I follow your train of thought with the “the struct shouldn’t itself” – using the same wording, why would it be OK then for the struct to “allocate itself” when initializing?

I think you should see it more as “the struct exposes ways to allocate/deallocate the memory it needs, at the request of the user”.

chung-leong · December 13, 2023, 12:50am

Why are you assuming that a different allocator is necessarily wrong? It’s entirely possible for functions to receive different allocators during different phases. A wrapper allocator could be been used, for instance, during allocation but not freeing.

When a function gets an allocator, it’s allocating memory on behalf of the caller, using the method chosen by the caller. The struct is not allocating itself since there is no self at this point. The caller is just invoking a helper in the struct’s namespace. For freeing, it can also ask a helper in the struct’s namespace. And again, that helper would be performing the operation on behalf its caller.

Sze · December 13, 2023, 1:21am

Sure there are many possible designs, but in the end there is only one which is actually written down in code, without seeing specific code anything is possible, I don’t think it makes sense to be overly strict/critical about “how the code should be written” when we haven’t seen the specific code.

I don’t think there are objective truths here, it seems more like tradeoffs and questions of taste. User A wants ArrayList and user B wants ArrayListUnmanaged.

jmc · December 13, 2023, 2:02pm

That sounds like a contrived example. The wrapper allocator could just modify the (re)allocation code paths and leave the deallocation code paths untouched, achieving the same even if “cached” by the object itself.

There are emerging “patterns” that derive from the normal use of a programming language, and caching the allocator seems to be one of them; the Zig standard library itself does this in many cases, e.g. see here and here, just to grab a couple at random. Not doing so can obviously be a valid design choice, if that’s what your use case requires, but it doesn’t mean that the day-to-day usage pattern should be changing as a result.

Besides, as the Zig zen says, “Memory is a resource”. Would you normally apply the same treatment to other resources? In Zig you open a std.fs.File object from a std.fs.Dir object: would you pass it a std.fs.Dir again when closing the file? The file knows about its own handle and can perform the resource release on its own. The same happens on most other resources.

tl;dr: what Sze said

gonzo · December 13, 2023, 6:42pm

I also follow the standard library practice of passing in an allocator when creating (init) an object, and remembering it inside the object. That way, it is guaranteed that destroying the object (deinit) will use the same allocator to release any memory allocated internally. Using a different allocator to release that memory would lead to all sorts of problems.

mscott9437 · December 13, 2023, 7:22pm

The ArrayList/MultiArrayList (maybe some others) also as the toOwnedSlice() method. This will convert your ArrayList into a regular slice without copying anything. You can then let the slice live through to the end of the current block, so calling deinit is not necessary. If you can pass the owned slice as a reference, then you will not need the allocator at this point.

AndrewCodeDev · December 14, 2023, 1:15am

Just to nitpick, you do need the same allocator because when it comes time to free the owned slice, you have to use the same allocator for that. You may have meant to to say that, just wanted to clear that up

/// The caller owns the returned memory. Empties this ArrayList,
/// Its capacity is cleared, making deinit() safe but unnecessary to call.
pub fn toOwnedSlice(self: *Self) Allocator.Error!Slice {
    const allocator = self.allocator;

    const old_memory = self.allocatedSlice();
    if (allocator.resize(old_memory, self.items.len)) {
        const result = self.items;
        self.* = init(allocator);
        return result;
    }

    const new_memory = try allocator.alignedAlloc(T, alignment, self.items.len);
    @memcpy(new_memory, self.items);
    @memset(self.items, undefined);
    self.clearAndFree();
    return new_memory;
}

@mscott9437 is correct, though - you can see that it tries to just resize it to the size of the items length to wipe out the extra capacity… if it can’t do that, it copies itself to the necessary slice, wipes itself out and then hands the copy back.

mscott9437 · December 14, 2023, 1:44am

Thanks for pointing that out. I think I had some wires crossed earlier. I was trying to mention that if you pass an ArrayList or slice as a reference, you can keep the allocator in your main block. But yeah the main point I was trying to make was that you won’t need to call deinit(). If I don’t need the ArrayList functions anymore, I can just call toOwnedSlice() then and there to store it in a variable, which also effectively deinits the ArrayList. Not sure what other implication this might have, but at least it’s stated in the docs that calling deinit() isn’t necessary when using toOwnedSlice().