I’m making a small library for analyzing graphs, and I ran into an issue with memory allocation / error handling. Basically it boils down to “how do I call .toOwnedSlice() on two separate ArrayLists?”, with @matklad’s great article as context for why this isn’t simple:
Here’s my main type (relevant parts only)
pub fn DirectedGraph(Node: type) type {
return struct {
const Link = struct { usize, usize };
nodes: []Node, // contents should never change
links: []Link, // contents should never change
pub fn fromOwnedSlices(nodes: []Node, links: []Link) @This() {
return .{
.nodes = nodes,
.links = links,
};
}
pub fn deinit(self: *@This(), gpa: std.mem.Allocator) void {
gpa.free(self.nodes);
gpa.free(self.links);
self.* = undefined;
}
// ...
};
}
Now, I want to be able to build these graphs incrementally (e.g. by reading from stdin), so I made a helper type:
/// used to build a DirectedGraph of unknown size
pub fn DirectedGraphBuilder(Node: type) type {
return struct {
const Link = struct { usize, usize };
nodes: std.ArrayList(Node),
links: std.ArrayList(Link),
const empty: @This() = .{
.nodes = .empty,
.links = .empty,
};
// BUGGY
pub fn toFrozen_v10(self: *@This(), gpa: std.mem.Allocator) !DirectedGraph(Node) {
const nodes = try self.nodes.toOwnedSlice(gpa);
const links = try self.links.toOwnedSlice(gpa);
return .fromOwnedSlices(nodes,links);
}
pub fn deinit(self: *@This(), gpa: std.mem.Allocator) void {
self.nodes.deinit(gpa);
self.links.deinit(gpa);
self.* = undefined;
}
// pub fn addNode(...) !void { ... }
// pub fn addLink(...) !void { ... }
};
}
toFrozen_v10 is buggy: if the second try fails, the const nodes slice will leak, and calling deinit() later won’t help.
So… how about this?
// BUGGY
pub fn toFrozen_v20(self: *@This(), gpa: std.mem.Allocator) !DirectedGraph(Node) {
const nodes = try self.nodes.toOwnedSlice(gpa);
errdefer gpa.free(nodes);
const links = try self.links.toOwnedSlice(gpa);
errdefer gpa.free(links); // (this is a no-op, but the symmetry is nice)
errdefer comptime unreachable;
return .fromOwnedSlices(nodes,links);
}
Still no good: No leak this time, but if the second try fails, the builder will be left in an invalid state (empty self.nodes, non-empty self.links)
Okay, let’s do the “reserve first” strategy:
// works?
pub fn toFrozen_v30(self: *@This(), gpa: std.mem.Allocator) !DirectedGraph(Node) {
const nodes_opt = gpa.remap(self.nodes.allocatedSlice(), self.nodes.items.len);
const nodes = if (nodes_opt) |nodes|
nodes
else
try gpa.alloc(Node, self.nodes.items.len);
errdefer { if (nodes_opt==null) gpa.free(nodes); }
const links_opt = gpa.remap(self.links.allocatedSlice(), self.links.items.len);
const links = if (links_opt) |links|
links
else
try gpa.alloc(Link, self.links.items.len);
errdefer { if (links_opt==null) gpa.free(links); }
errdefer comptime unreachable;
if (nodes_opt) |_| {
self.nodes = .empty;
} else {
@memcpy(nodes, self.nodes.items);
self.nodes.clearAndFree(gpa);
}
if (links_opt) |_| {
self.links = .empty;
} else {
@memcpy(links, self.links.items);
self.links.clearAndFree(gpa);
}
return .fromOwnedSlices(nodes,links);
}
Well… I think it finally works (no error will leak memory or put this data structure into an invalid state, I think) but yikes, that did not go well. We wanted to just write v10, but ended up needing v30 instead to properly handle errors.
I basically had to inline each .toOwnedList() so that I could interleave them, to reserve all of the memory in advance. I wonder if the API of toOwnedList() should be reconsidered somehow: it’s a bit awkward that it can error, and that it does two separate things (remap, and reallocate if the remap fails).
This would be much nicer:
// BUGGY
pub fn toFrozen_v40(self: *@This(), gpa: std.mem.Allocator) DirectedGraph(Node) {
self.nodes.shrinkAndFree(gpa, self.nodes.items.len);
self.links.shrinkAndFree(gpa, self.links.items.len);
const nodes = self.nodes.toOwnedSlice(gpa) catch unreachable;
const links = self.links.toOwnedSlice(gpa) catch unreachable;
return .fromOwnedSlices(nodes,links);
}
but this doesn’t work either: shrinkAndFree() claims to reduce the allocated capacity to the given length, but it gives up if the remap and reallocation both fail, so when toOwnedSlice() runs, it hits the same remap and allocation failures, triggering unreachable.
If we made a fn shrinkAndFreeForRealPlease(...) !void that errored if it wasn’t able to shrink to exactly the requested length, then this might be nice:
// IMAGINARY, AND BAD
pub fn toFrozen_v50(self: *@This(), gpa: std.mem.Allocator) !DirectedGraph(Node) {
try self.nodes.shrinkAndFreeForRealPlease(gpa, self.nodes.items.len);
try self.links.shrinkAndFreeForRealPlease(gpa, self.links.items.len);
errdefer comptime unreachable;
const nodes = self.nodes.allocatedSlice();
const links = self.links.allocatedSlice();
self.nodes = .empty;
self.links = .empty;
return .fromOwnedSlices(nodes,links);
}
but this doesn’t work either: there are good reasons why an allocator might not be able to shrink an allocation.
I may turn this into a blog post but I wanted to get some feedback first. Thoughts? Can you do better than my toFrozen_v30? Can you imagine a better version of the ArrayList.toOwnedSlice() API, or is it fine as-is? Should I change my graph API somehow? I might remove the split between DirectedGraph and DirectGraphBuilder, and instead add a pointer_stability lock, like HashMapUnmanaged. But that feels like giving up after just two try statements, surely it’s possible to deal with multiple erroring functions without running away screaming?