Freeing memory allocated in a loop

I’m converting a bunch of data in structs from one type to another; more or less a functional map operation. I have a list of users from the network (NetworkUser) with a first and last name. I loop over them to create a plain User who only has a full_name.

Problem is, I’m getting a memory leak when I format the first and last names into a full name. Since I’m using an allocator to format inside the loop, I’m not sure how I can free it:

// zig 0.14.1
const std = @import("std");
const Allocator = std.mem.Allocator;

const NetworkPerson = struct { first_name: []const u8, last_name: []const u8 };
const Person = struct { full_name: []const u8 };

fn genPeople(allocator: Allocator, network_peeps: []NetworkPerson) ![]Person {
    var people = try std.ArrayListUnmanaged(Person).initCapacity(allocator, network_peeps.len);
    for (network_peeps) |np| {
        const full_name = try std.fmt.allocPrint(
            allocator,
            "{s} {s}",
            .{ np.first_name, np.last_name },
        );
        try people.append(allocator, Person{ .full_name = full_name });
    }

    return people.toOwnedSlice(allocator);
}

test "generates people from first and last names" {
    const allocator = std.testing.allocator;
    const network_peeps: [4]NetworkPerson = .{
        NetworkPerson{ .first_name = "John", .last_name = "Lennon" },
        NetworkPerson{ .first_name = "Paul", .last_name = "McCartney" },
        NetworkPerson{ .first_name = "George", .last_name = "Harrison" },
        NetworkPerson{ .first_name = "Ringo", .last_name = "Starr" },
    };

    const people = try genPeople(
        allocator,
        @constCast(network_peeps[0..]),
    );
    defer allocator.free(people);

    try std.testing.expectEqualStrings(people[0].full_name, "John Lennon");
    try std.testing.expectEqualStrings(people[1].full_name, "Paul McCartney");
    try std.testing.expectEqualStrings(people[2].full_name, "George Harrison");
    try std.testing.expectEqualStrings(people[3].full_name, "Ringo Starr");
}

I’ve tried deferring the free (learned from Allocating memory in a loop) but that doesn’t work because toOwnedSlice empties out the ArrayList:

         );
         try people.append(allocator, Person{ .full_name = full_name });
     }
+    defer {
+        for (people.items) |person| {
+            // nothing here to free
+            allocator.free(person.full_name);
+        }
+    }

     return people.toOwnedSlice(allocator);
 }

I’ve also tried returning people.items instead with and without the defer block with the same results.

If I free the full_name inside the loop then I get a segmentation fault.

I might be able to pass in an ArrayList to the function, but I don’t love that the caller has to know implementation details about what the function does.

How can I allocate memory in a loop and free it?

That field either has to be allocated (and freed), or not if it is a static string. If you’re returning the struct to the caller, there is no way around the caller being aware of this. Some possible solutions:

  • always allocate that string if that struct is returned to the user, and document that it must be freed by its owner

  • add another field indicating whether it is allocated, or similarly, put the field in a tagged union with variants for allocated or not; this allows allocated and non-allocated strings to be used

  • use two fields for first and last name in the Person struct, and provide a method that formats them into a full name (and doc that the returned string must be freed by the caller)

More generally, you’ll have to deal with this question of ownership and alloc/free for all your fields that may be allocated, not just this one.

1 Like
// zig 0.14.1
const std = @import("std");
const Allocator = std.mem.Allocator;

const NetworkPerson = struct { first_name: []const u8, last_name: []const u8 };
const Person = struct { full_name: []const u8 };

fn genPeople(allocator: Allocator, network_peeps: []NetworkPerson) ![]Person {
    var people = try std.ArrayListUnmanaged(Person).initCapacity(allocator, network_peeps.len);
    for (network_peeps) |np| {
        const full_name = try std.fmt.allocPrint(
            allocator,
            "{s} {s}",
            .{ np.first_name, np.last_name },
        );
        try people.append(allocator, Person{ .full_name = full_name });
    }

    return people.toOwnedSlice(allocator);
}

test "generates people from first and last names" {
    const allocator = std.testing.allocator;
    const network_peeps: [4]NetworkPerson = .{
        NetworkPerson{ .first_name = "John", .last_name = "Lennon" },
        NetworkPerson{ .first_name = "Paul", .last_name = "McCartney" },
        NetworkPerson{ .first_name = "George", .last_name = "Harrison" },
        NetworkPerson{ .first_name = "Ringo", .last_name = "Starr" },
    };

    const people = try genPeople(
        allocator,
        @constCast(network_peeps[0..]),
    );
    defer {
        for (people) |person| {
            allocator.free(person.full_name);
        }
        allocator.free(people);
    }

    try std.testing.expectEqualStrings(people[0].full_name, "John Lennon");
    try std.testing.expectEqualStrings(people[1].full_name, "Paul McCartney");
    try std.testing.expectEqualStrings(people[2].full_name, "George Harrison");
    try std.testing.expectEqualStrings(people[3].full_name, "Ringo Starr");
}

I think the subtly you missed is you are returning a slice ([]Person) and not an ArrayList. So, you don’t need people.items, but instead just people.

Edit: oops, I misread the original post. I thought your diff with the defer was for the unit test, not the original source code :sweat_smile: let me elaborate.

The std.testing.allocator is configured to report memory leaks by default, so that explains why you see errors when you forgot to free your memory in your tests.

Back to your original post, I believe the “correct” solution is to use an errdefer like so:

fn genPeople(allocator: Allocator, network_peeps: []NetworkPerson) ![]Person {
    var people = try std.ArrayListUnmanaged(Person).initCapacity(allocator, network_peeps.len);
    errdefer {
        for (people) |person| {
            allocator.free(person.full_name);
        }
        allocator.free(people);
    }
    for (network_peeps) |np| {
        const full_name = try std.fmt.allocPrint(
            allocator,
            "{s} {s}",
            .{ np.first_name, np.last_name },
        );
        try people.append(allocator, Person{ .full_name = full_name });
    }

    return try people.toOwnedSlice(allocator);
}

That way, your memory will be freed if and only if your functions returns an error (in this particular case, an error.OutOfMemory. Note that I added a try on the last line of the function. While this feels pedantic, it’s actually necessary in case toOwnedSlice fails because otherwise the function will miss the de-allocations made by allocPrint.

1 Like

try is not necessary. See that this test passes without any leaks:

fn testLeak(alloc: std.mem.Allocator) ![]u8 {
    const foo = try alloc.alloc(u8, 1);
    errdefer alloc.free(foo);
    return error.Unsuccessful;
}

test "leak test" {
    const alloc = std.testing.allocator;
    try std.testing.expectError(error.Unsuccessful, testLeak(alloc));
}

errdefer expressions are always evaluated when an error is returned, regardless of if the error is returned via a try or via a return. try <expr> is exactly equivalent to <expr> catch |err| return err.

2 Likes

TIL! Thank you!

Thanks all for the thinking so far. I’m digesting and considering my options here.

Do you know if there’s a way to associate the memory with the ArrayList so when I free it I also free all it’s associated memory? I tried something like this:

for (network_peeps) |np| {
    const full_name = try std.fmt.allocPrint(...);
    defer allocator.free(full_name);
    const owned_full_name = allocator.dupe(u8, full_name);
    try people.append(allocator, Person{ .full_name = owned_full_name });
}

…and only called free on the ArrayList but that also causes a seg fault.

No, you do that manually in Zig. Normally Person would have a deinit method which is responsible for freeing any of its contents that are allocated. When you have a slice or ArrayList of Person values, you’d iterate over it and call that deinit method, before freeing the slice or ArrayList.

…and only called free on the ArrayList but that also causes a seg fault.

Not sure what you mean.

You’re doing an unnecessary allocation there, since you can simply assign full_name to the field in Person (don’t free it, and skip the the use of owned_full_name). Unless you need to hold onto a copy of the name for some other reason.

PS. Another pattern is to create a PersonList struct with a single field – the slice or ArrayList of Person values. Then PersonList can have a deinit method that frees everything as I described above. But no matter what, someone has to write the code that iterates over the values and frees what needs to be freed – neither Zig itself, nor the collection types in the std library, will do this for you.

Ah, sorry. I mean that when I access people[0].full_name in the test, it seg faults. Presumably because the memory is freed.

:light_bulb: Thanks for this idea. I’m going to try it.

I’m realizing what I’m asking for are patterns/idioms. I’m coming from high-level languages (ruby, js) so how to deal with memory is still a new challenge. These ideas are helpful.

2 Likes

There’s two insights here:

  1. You’re heap allocating both the names and the people. The caller needs a way to free both.

  2. Allocating lots of things in a loop leads to messy cleanup code.

I think @TheShinx317 got it right, but even with errdefer/defer doing a lot of work to make sure the frees happen in every code path, using them correctly is tricky as soon as your allocation pattern becomes any more complicated than “single allocations at the start of a scope”.

So. That’s what “good” memory management looks like in Zig.

Single large allocations in the start of scopes or functions that can be cleaned up with simple uses of defer/errdefer. Learning to design code that works like that can be pretty tricky, but it’s a very rewarding part of low level work IMO.

Compare the ArrayList-based implementation to something like this:

// zig 0.14.1
const std = @import("std");
const Allocator = std.mem.Allocator;

const NetworkPerson = struct {
    first_name: []const u8,
    last_name: []const u8,
};

const Person = struct {
    full_name: []const u8,
};

// Since both the names and the []Person slice need the same lifetime,
// it makes sense to group them in a struct
const People = struct {
    slice: []Person,
    name_buffer: []u8,

    // Returns the size of the smallest possible buffer that can contain
    // all the names of []NetworkPerson
    fn getNameBufferSize(
        network_peeps: []const NetworkPerson,
    ) usize {
        var buffer_size: usize = 0;
        for (network_peeps) |np| {
            // fmt.count returns the minimum buffer size required for fmt.print
            buffer_size += std.fmt.count(
                "{s} {s}",
                .{ np.first_name, np.last_name },
            );
        }
        return buffer_size;
    }

    pub fn initFromNetwork(
        allocator: std.mem.Allocator,
        network_peeps: []const NetworkPerson,
    ) !People {
        const content = try allocator.alloc(
            Person,
            network_peeps.len,
        );
        errdefer allocator.free(content);

        const name_buffer = try allocator.alloc(
            u8,
            getNameBufferSize(network_peeps),
        );

        // Now that we're done allocating, we can tell
        // the zig compiler to scream loudly if we have
        // unhandled errors past this point, so we don't
        // accidentally add more error paths that need
        // errdefer cleanup logic
        errdefer comptime unreachable;

        var index: usize = 0;
        for (network_peeps, content) |np, *p| {
            // bufPrint does two things:
            //    puts the printed string in the buffer
            //    returns a slice of the string
            p.full_name = std.fmt.bufPrint(
                name_buffer[index..],
                "{s} {s}",
                .{ np.first_name, np.last_name },
            ) catch |err| switch (err) {
                // getNameBufferSize makes sure the buffer can fit the names,
                // so the error case is impossible and we can tell the
                // compiler that
                error.NoSpaceLeft => unreachable,
            };

            index += p.full_name.len;
        }

        return .{ .name_buffer = name_buffer, .slice = content };
    }

    pub fn deinit(people: People, allocator: std.mem.Allocator) void {
        allocator.free(people.slice);
        allocator.free(people.name_buffer);
    }
};

test "generates people from first and last names" {
    const allocator = std.testing.allocator;
    const network_peeps: []const NetworkPerson = &.{
        .{ .first_name = "John", .last_name = "Lennon" },
        .{ .first_name = "Paul", .last_name = "McCartney" },
        .{ .first_name = "George", .last_name = "Harrison" },
        .{ .first_name = "Ringo", .last_name = "Starr" },
    };

    const people: People = try .initFromNetwork(
        allocator,
        network_peeps,
    );
    defer people.deinit(allocator);

    try std.testing.expectEqualStrings(
        people.slice[0].full_name,
        "John Lennon",
    );
    try std.testing.expectEqualStrings(
        people.slice[1].full_name,
        "Paul McCartney",
    );
    try std.testing.expectEqualStrings(
        people.slice[2].full_name,
        "George Harrison",
    );
    try std.testing.expectEqualStrings(
        people.slice[3].full_name,
        "Ringo Starr",
    );
}

  1. People encapsulates that you have two things that need freeing, and it has paired init and deinit functions. That way it’s easy for client code to see that you need to defer deinit to clean up.

  2. The error cleanup path in initFromNetwork is simple, since the complicated logic happens on a code path that isn’t allowed to fail, and this invariant is compiler-enforced.

  3. (Bonus) You’re allocating memory way less frequently in cases where network_peeps is large. Memory allocation is slow. Memory allocation can fail.

3 Likes

I’m realizing what I’m asking for are patterns/idioms. I’m coming from high-level languages (ruby, js) so how to deal with memory is still a new challenge. These ideas are helpful.

Look into the ArenaAllocator. It’s a very useful thing that transforms “lots of small allocations with lots of small frees” into “occasional large allocations with a single shared free” with no change to your application code.

(though, it might be better to practice doing without in the start to get a feel for memory allocation)

1 Like

I agree wholeheartedly! When I wrote my answer above, I held back on recommending arenas because, although they are the simpler solution, I didn’t want to deprive the OP of the opportunity to learn how/when to use errdefer vs defer.

@andrewrk and I actually had a conversation about this at the last Systems Distributed. While writing a toy parser by hand, I went through all the pain and suffering that is required to management memory this way. While I did eventually end up switching to arenas, I’m glad I still went through the exercise of using errdefer for all my allocations because it gave me a deeper appreciation for how powerful they can be!

5 Likes

Funny enough, I also wanted to avoid ArenaAllocator because it seems good for short-lived data like a small CLI, or potentially some data munging like this. But I’m hoping to work up to a GUI so I want to learn these fundamentals better, first.


I’m liking this deinit pattern!

const PersonList = struct {
    people: []Person,

    fn deinit(self: PersonList, allocator: Allocator) void {
        for (self.people) |person| {
            allocator.free(person.full_name);
        }
        allocator.free(self.people);
    }
};

Then wherever I return that PersonList, I need to call it’s deinit:

const person_list = try genPeople(
    allocator,
    @constCast(network_peeps[0..]),
);
defer person_list.deinit(allocator);

I like the abstraction here since the caller only needs to know to call deinit which fits in with other stdlib types. I may go further and make an init(allocator: Allocator) method to match the deinit and an append to match ArrayList.

Next up: trying to absorb @xjlqr’s post!

Full code for reference
const std = @import("std");
const Allocator = std.mem.Allocator;

const NetworkPerson = struct { first_name: []const u8, last_name: []const u8 };
const Person = struct { full_name: []const u8 };
const PersonList = struct {
    people: []Person,

    fn deinit(self: PersonList, allocator: Allocator) void {
        for (self.people) |person| {
            allocator.free(person.full_name);
        }
        allocator.free(self.people);
    }
};

fn genPeople(allocator: Allocator, network_peeps: []NetworkPerson) !PersonList {
    var people = try std.ArrayListUnmanaged(Person).initCapacity(allocator, network_peeps.len);
    for (network_peeps) |np| {
        const full_name = try std.fmt.allocPrint(
            allocator,
            "{s} {s}",
            .{ np.first_name, np.last_name },
        );
        try people.append(allocator, Person{ .full_name = full_name });
    }

    return PersonList{ .people = try people.toOwnedSlice(allocator) };
}

test "generates people from first and last names" {
    const allocator = std.testing.allocator;
    const network_peeps: [4]NetworkPerson = .{
        NetworkPerson{ .first_name = "John", .last_name = "Lennon" },
        NetworkPerson{ .first_name = "Paul", .last_name = "McCartney" },
        NetworkPerson{ .first_name = "George", .last_name = "Harrison" },
        NetworkPerson{ .first_name = "Ringo", .last_name = "Starr" },
    };

    const person_list = try genPeople(
        allocator,
        @constCast(network_peeps[0..]),
    );
    defer person_list.deinit(allocator);

    try std.testing.expectEqualStrings(person_list.people[0].full_name, "John Lennon");
    try std.testing.expectEqualStrings(person_list.people[1].full_name, "Paul McCartney");
    try std.testing.expectEqualStrings(person_list.people[2].full_name, "George Harrison");
    try std.testing.expectEqualStrings(person_list.people[3].full_name, "Ringo Starr");
}
1 Like

Next up: trying to absorb @xjlqr’s post!

Recommended reading:

Edit while i’m nitpicking:

The @constCast is a pretty big flaw here:

const person_list = try genPeople(
    allocator,
    @constCast(network_peeps[0..]),
);
defer person_list.deinit(allocator);

You get a compile error without it because you’re passing a constant to a function accepting a mutable slice of data.

@constCast is telling the compiler “I know the underlying data here is mutable even though the type system says it is const”. It does make the code compile here, but it isn’t a valid use of @constCast.

If you ever changed genPeople to modify the []NetworkPerson slice, which is a legal operation from the function signature, your program would crash.

The way you should resolve that instead is changing the function signature:

- fn genPeople(allocator: Allocator, network_peeps: []NetworkPerson)
+ fn genPeople(allocator: Allocator, network_peeps: []const NetworkPerson)

In general, @constCast is a code smell. There are situations where it is needed, but those situations are almost exclusively when working with library code you don’t own.

2 Likes

When the lifetime of container is the same with the lifetime of its elements, using an arena allocator for both the container and its elements is a much better solution.

In other words, freeing memory in deinit using for is a red flag that an arena allocator is a better fit.

2 Likes

This is kinda blowing my mind: Freeing memory allocated in a loop - #9 by xjlqr

I don’t think I would have considered putting all the names in a single buffer so the data is easier to free later, but it sure does work! It’s like a bunch of “vertical slices” of your data; e.g. one per struct property in the list. Instead of a “horizontal slice” where each struct holds everything.

I could see this cascading nicely, too, in a big graph of structs where each struct has a deinit that calls it’s children’s deinit, although maybe that is a smell as @dimdin points out.

I also got it working by storing an arena allocator instead of the name_buffer, and then calling deinit on that in the People.deinit. That satisfies the mental model I came in with, but I’m definitely glad I learned this instead. I’ve got more options now :grinning_face:

Thanks for pointing that out. I didn’t like using that but I was unsure how to get a list of structs for the test. I need to go back and review my understanding of const, but that’s probably for another thread.

Thanks everyone for all the advice!

2 Likes

You’ve hit the nail on the head here.

It’s related to the “struct of arrays” design which is used all over the place in modern high performance code. Search terms “Data Oriented Design” or “SoA / AoS”, if you want to learn more. That is exactly a sort of vertical-horizontal difference in how you store the data.

Good luck on your Zig journey! :slight_smile:

2 Likes