Freeing memory on error

Here is a simple tree-like structure representing mathematical expressions. I made two helper functions to create expression easily with this syn

pub const Binop = enum { ADD, SUB, DIV, MUL, POW };

pub const Expr = union(enum) {
    integer: i128,
    binop: struct {
        op: Binop,
        lhs: *Expr,
        rhs: *Expr,
    },

    pub fn fromInteger(value: i128, allocator: std.mem.Allocator) std.mem.Allocator.Error!*Expr {
        const e = try allocator.create(Expr);
        e.* = Expr{ .integer = value };
        return e;
    }

    pub fn fromBinop(op: Binop, lhs: *Expr, rhs: *Expr, allocator: std.mem.Allocator) std.mem.Allocator.Error!*Expr {
        errdefer {
            lhs.deinit(allocator);
            rhs.deinit(allocator);
        }
        const e = try allocator.create(Expr);
        e.* = Expr{ .binop = .{ .op = op, .lhs = lhs, .rhs = rhs } };
        return e;
    }

    pub fn deinit(self: *Expr, allocator: std.mem.Allocator) void {
        defer allocator.destroy(self);
        switch (self.*) {
            .binop => |binop| {
                binop.lhs.deinit(allocator);
                binop.rhs.deinit(allocator);
            },
            .integer => {
                // Simple values don't have allocated children
            },
        }
    }
};

Let’s say I want to create the operation 1 + 5, i will write

const expr = try Expr.fromBinop(.ADD,
 try Expr.fromInteger(1, allocator),
 try Expr.fromInteger(5, allocator),
 allocator);
defer expr.deinit(allocator);

If the allocation in Expr.fromBinop fails the subexpressions will be deallocated thanks to the errdefer block and all is well.
But if the allocation in the second Expr.fromInteger fails the first Expr.fromInteger will leak memory :confused: .
So let’s get out these allocations from the arguments and handle possible failure:

const a = try Expr.fromInteger(1, allocator);
errdefer a.deinit(allocator);
const b = try Expr.fromInteger(5, allocator);
errdefer b.deinit(allocator);
const expr = try Expr.fromBinop(.ADD, a, b, allocator);
defer expr.deinit(allocator);

Now if b allocation fails a will be correctly freed.
But if expr allocation fails, a and b will be double-freed.

The problem here is:

  • The caller has to free a and b until its passed to Expr.fromBinop which takes ownership of the variables.
  • After giving ownership expr becomes responsible for freeing a and b.

The possible solutions:

  • Don’t care: if allocation fails children will leaks and its okayish
  • Allocate in a block which returns a and b destructure and pass them to Expr.fromBinop documenting that this function takes ownership (error prone and quite verbose for a common situation).
  • Don’t take ownership and copy the arguments (possibly very costly for large expression).

What would be the idiomatic way in zig ?

This blog post from @n0s4 may be what I was looking for.

Let’s reimplement my Expr structure according to this blog post:

pub const Expr = struct {
 nodes: std.ArrayList(Node),
 const Node = union {
    integer: i128,
    unop: struct {
        op: Unop,
        expr: NodeIndex,
    },
    binop: struct {
        op: Binop,
        lhs: NodeIndex,
        rhs: NodeIndex,
    },
  };
 const NodeIndex = enum(u32) { _ };
};

My problem now would be to be able to create a binary operation without:

  • copying two Expr into another one (otherwise i’d be better off cloning my initial Expr structure).
  • having invalid states.

And i don’t see how the indices method helps me to achieve this.
Whether I have to be in a state where a binary operation has null children whether on each binary operation creation i have to copy every children (hence 2^n array copies for a n-depth expression).

This is the correct solution if each Expr needs to be separately allocated (you need its pointer to be stable) and you’re using a global heap (the normal case). Another solution is to use an arena allocator, rather than a global allocator, for the Expr tree. Then you can avoid explicitly freeing each individual Expr, and deinit the arena to free the entire Expr tree.

1 Like

Sorry, I’m having trouble understanding the problem you’re describing. No copying is needed, since you’re referring to Expr by index. You can construct the binop with the indexes of lhs and rhs, so there would be no invalid state. I’m sure I’m missing something.

1 Like

No you’re kinda right, from a tree pov it is an invalid state: before constructing the binary operation the expression contains two nodes that aren’t related, which is not a tree.
But as I implemented it, I aknowledge that this is completely acceptable and necessary to avoid unnecessary copies and binary operations with null children.

So the indices method is completely valid for my use case, this way the nodes are owned by the expression from the beginning and there is no transfer of ownership anymore.

In conclusion, heads up to @n0s4 that’s a great article and thanks to @jumpnbrownweasel you helped me rubber duck debugging.

Another related blog post that may be relevant to this issue:

1 Like

Edit: This post is incorrect, see below.

This is the correct way to write it, and we should not release them in fromBinop; their errdefer should be executed where they are created.

I haven’t fully understood your question about the version without pointers. According to that article, you would also need a stack to record invalid NodeIndexes.

With this solution, if an error happens after expr allocation, a and b would be double-freed (once because of errdefer and once because of expr.deinit).

Oh! My bad, I overlooked that the expr will be dropped in the current scope rather than returned after transferring ownership. So here’s what we should do:

const expr = blk: {
    const a = try Expr.fromInteger(1, allocator);
    errdefer a.deinit(allocator);
    const b = try Expr.fromInteger(5, allocator);
    errdefer b.deinit(allocator);
    break :blk try Expr.fromBinop(.ADD, a, b, allocator);
};
defer expr.deinit(allocator);
3 Likes

From one viewpoint there is still transfer of ownership when the indexes are passed to the function that creates the binary expression. However, error handling is simpler if you deinit the entire ArrayList whenever an error occurs, e.g., while attempting to grow the list. This is the same as using an arena allocator in the sense that the entire expr tree is deallocated at once rather than deallocating individual nodes.

Note that with an ArrayList, if you obtain temporary pointers to the nodes, those pointers will be invalidated when the ArrayList grows, so you have to be careful about that. The arena allocator solution does not have this problem because pointers are stable until the arena is deinited. Just something to be aware of.

Yes, Reserve First is a good strategy to simplify error handling when you need to allocate and free individual Expr on the global heap. That’s a good point.

That’s a really good point, maybe enforcing the use of an arena for expression construction would be a better solution than the ArrayList one.
Thinking to what you said wouldn’t it a better API that addOne and co. return an index instead of a pointer since pointer aren’t stables in an array ?

1 Like

I think you should probably move nodes: std.ArrayList(Node) to a pub var nodes: std.ArrayList(Node) = .empty; declaration within the Expr namespace, and change the original nodes field to a starting index and length.
That way, you have no “real” memory ownership within the Expr struct’s fields.

EDIT: Actually this is pretty thoughtless since it just makes it way harder to free

Yes, they should definitely return an index because the caller will need that index when constructing larger expressions.

It depends. While the pointers are stable with an arena, you don’t get the other advantages in the indices-not-pointers article, so you have to decide whether those are important to you. Also, if you know the total size (or some reasonable max size) of the Expr tree, you can preallocate the ArrayList and then the pointers will be stable. Lots of options.

1 Like

I would like to suggest another solution that just simply circumvents the problem by changing the model a little bit. Why are you allocating expressions? Does it make sense to pass around an integer as a pointer? Instead I’d suggest to only allocate once you start actually putting them into a container:

    pub fn fromBinop(op: Binop, lhs: Expr, rhs: Expr, allocator: std.mem.Allocator) std.mem.Allocator.Error!*Expr {
        const lhsPtr = try allocator.create(Expr);
        errdefer allocator.free(lhsPtr);
        lhsPtr.* = lhs;
        const rhsPtr = try allocator.create(Expr);
        errdefer allocator.free(rhsPtr);
        rhsPtr.* = rhs;
        e.* = Expr{ .binop = .{ .op = op, .lhs = lhsPtr, .rhs = rhsPtr } };
        return e;
    }
...
const expr = try Expr.fromBinop(.ADD, .{ .integer = 1 }, .{ .integer = 5 }, allocator);

From there you could also do other stuff, like shuffle the data layout so there is only one *[2]Expr in the binop.