Better pattern to deallocate all memory when returning error

Zemogus · June 12, 2024, 12:05pm

Hi all, I’m writing some reinforcement learning code that essentially boils down to this:

pub fn init(allocator: Allocator, population_size: usize, options: Options) !Self {
    var population = try allocator.alloc(Genome, population_size);
    var i: usize = 0;
    while (i < population_size) : (i += 1) {
        population[i] = try Genome.init(allocator);
    }

    return .{
        .allocator = allocator,
        .population = population,
        .options = options,
    };
}

After adding some errdefers to free any half-initialised memory, I can’t help but find the deeply indented errdefer block looking needlessly ugly. One alone isn’t so bad, but I have code with multiple similar blocks and they take up so much space that the actually important code has become a little hard to read. There’s also the fear that I might forget an errdefer somewhere while editing one of the more complex functions.

pub fn init(allocator: Allocator, population_size: usize, options: Options) !Self {
    var population = std.ArrayList(Genome).init(allocator);
    try population.ensureTotalCapacityPrecise(population_size);
    errdefer {
        for (0..population.items.len) |i| {
            population.items[i].deinit(allocator);
        }
        population.deinit();
    }

    for (0..population_size) |_| {
        population.appendAssumeCapacity(try Genome.init(allocator));
    }

    return .{
        .allocator = allocator,
        .population = try population.toOwnedSlice(),
        .options = options,
    };
}

My first thought was to use a function-scoped arena allocator, but it turns out that std.heap.ArenaAllocator stores the size and next pointer at the start of each allocation, so I couldn’t figure out how to only deallocate them while keeping the rest of the allocation (in the case of the happy path).

Does anyone know a better pattern I could use here?

dimdin · June 12, 2024, 2:05pm

One way is to alloc(Genome, population_size) this returns a slice []Genome.
The advantage is that you only have a single alloc/free.
If you don’t want to change the population size you are done, otherwise you can convert it to ArrayList using fromOwnedSlice.

EDIT: the code becomes:

pub fn init(allocator: Allocator, population_size: usize, options: Options) !Self {
    return .{
        .allocator = allocator,
        .population = try allocator.alloc(Genome, population_size),
        .options = options,
    };
}

and you need the following deinit:

pub fn deinit(self: Self) void {
    self.allocator.free(self.population);
}

dude_the_builder · June 12, 2024, 2:11pm

You could encapsulate all that code into a struct with its own housekeeping methods:

const Population = struct {
  allocator: std.mem.Allocator,
  list: std.ArrayLIst(Genome),

  pub fn init(allocator: std.mem.Allocator, size: usize) !Population {
    var self = Population{
      .allocator = allocator,
      .list = try std.ArrayList(Genome).initCapacity(allocator, size),
    };
    errdefer self.deinit();

    for (0..size) |_| {
        self.list.appendAssumeCapacity(try Genome.init(allocator));
    }

    return self;
  }

  pub fn deinit(self: *Population) void {
    for (self.list.items) |*item| item.deinit(self.allocator);
    self.list.deinit();
  }
};

pub fn init(allocator: Allocator, population_size: usize, options: Options) !Self {
    return .{
        .allocator = allocator,
        .population = try Population.init(allocator, population_size),
        .options = options,
    };
}

If you’re making many of these lists with different item types, you can make this a generic function that takes the item type and returns the struct.

dimdin · June 12, 2024, 4:18pm

EDIT: the following is wrong. Proper cleanup is provided in deinit.

In try Genome.init the code might leak previous allocations before the failed one.
This is why I am proposing to allocate everything at once.
What is more interesting is the final return .population = try population.toOwnedSlice(), in the initial code, that converts the ArrayList back to a slice.

IntegratedQuantum · June 12, 2024, 6:07pm

There is one simple trick to make for loops like this less ugly.
Instead of iterating through the indices you can directly iterate over elements like this:

for(population.items) |genome| { // Or |*genome| if you need a mutable reference
    genome.deinit();
}

This would even make it quite comfortable to put this into a single line:

for(population.items) |genome| genome.deinit();

And honestly I think at this point it wouldn’t even look that bad if you separate the errdefer block (note that this means that the order of statements needs to be reversed):

errdefer population.deinit();
errdefer for(population.items) |genome| genome.deinit();

I personally would even replace errdefer population.deinit() with defer population.deinit().
This makes functionally no difference, but it’s a good habit to always add a defer deinit on the next line for anything that is inited on the stack.

var population = std.ArrayList(Genome).init(allocator);
defer population.deinit();
try population.ensureTotalCapacityPrecise(population_size);
errdefer for(population.items) |genome| genome.deinit();

dude_the_builder · June 12, 2024, 7:16pm

But wouldn’t the errdefer self.deinit() clean those up?

dimdin · June 12, 2024, 7:38pm

Oh! You are right. I missed the deinit change.
Your solution is adding the “one level of indirection” that solves all the problems

dude_the_builder · June 12, 2024, 7:48pm

Actually, I’m not sure if it would be better to substitute the ArrayList with your suggested one-shot allocation and then add an allocated counter variable to keep track of how many items need deinit before freeing the whole slice on error.

Sze · June 12, 2024, 7:58pm

I don’t think that is necessary, I guess I would change it to use an ArrayListUnmanaged because the data structure already stores the allocator.

Then the only difference would be that the arraylist still stores an explicit capacity, but that could become useful if you want to implement other functions that add new elements, so that might even be useful.

So I wouldn’t really change that unless there is a good reason for that.

dimdin · June 12, 2024, 8:00pm

These details depends on the actual requirements.

Everyone added some useful patterns to: “Better pattern to deallocate all memory when returning error”.

Use one defer/errdefer for each resource.
Use a single allocation for many identical resources.
Encapsulate allocation in a struct
Ask in ziggit for help

Zemogus · June 13, 2024, 4:38am

That looks much neater! It’s so simple I almost feel embarrassed .

Zemogus · June 13, 2024, 4:42am

I’m marking this as the solution as this is exactly what I asked for. All the deconstruction logic abstracted somewhere else where I don’t have to look at or think about it.