What's the cost of initializing an ArenaAllocator? Am I abusing it?

Hi, I have the following program (sorry for the length):

const std = @import("std");

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = gpa.deinit();
    const allocator = gpa.allocator();

    try getFilesInCurrentDir(allocator);
}

fn getPatterns(allocator: std.mem.Allocator, dir: std.fs.Dir) [][]const u8 {
    const file = dir.openFile(".gitignore", .{}) catch |err| {
        // @errorName prints the error's name; @typeName(@TypeOf(err)) would print the error set type instead.
        std.debug.print("Unable to open .gitignore: {s}\n", .{@errorName(err)});
        return &.{};
    };
    defer file.close();

    const contents = file.readToEndAlloc(allocator, 1024 * 10) catch |err| {
        std.debug.print("Unable to read .gitignore: {s}\n", .{@errorName(err)});
        return &.{};
    };

    var patterns = std.ArrayList([]const u8).init(allocator);
    var iter = std.mem.split(u8, contents, "\n");
    while (iter.next()) |pattern| {
        if (pattern.len == 0) continue;
        patterns.append(pattern) catch |err| {
            std.debug.print("Error appending pattern '{s}': {s}\n", .{ pattern, @errorName(err) });
            continue;
        };
    }
    return patterns.toOwnedSlice() catch &.{};
}

fn getFilesInCurrentDir(gpa: std.mem.Allocator) GetFilesError!void {
    var arena = std.heap.ArenaAllocator.init(gpa);
    defer arena.deinit();
    const allocator = arena.allocator();

    var dir = std.fs.cwd().openDir(".", .{ .iterate = true }) catch {
        return GetFilesError.CantOpenCWD;
    };
    defer dir.close();

    const patterns = getPatterns(allocator, dir);

    var iter = dir.iterate();
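    entries: while (iter.next()) |maybe_entry| {
        const entry = maybe_entry orelse break;
        const name = entry.name;
        for (patterns) |p| {
            // Skip ignored entries. A bare `continue` here would only advance the inner for loop.
            if (match(p, name)) continue :entries;
        }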
        std.debug.print("{s}\n", .{name});
    } else |err| {
        std.debug.print("Error while iterating through directory: {s}\n", .{@errorName(err)});
    }
}

If I use the GPA directly, I’d have to free memory manually in a few places throughout getPatterns and getFilesInCurrentDir.
If I use an arena, I don’t have to worry about that, since everything gets freed by the arena in the defer statement in getFilesInCurrentDir.

I have a few questions:

  1. Is my usage of ArenaAllocator in this case justified? Or am I abusing it for the sake of being lazy?
  2. How cheap or expensive is it to initialize an ArenaAllocator?

Thank you very much!

Arenas are very cheap; in fact, they are just 3 fields (see zig/lib/std/heap/arena_allocator.zig at master · ziglang/zig · GitHub).

You can get it down to two fields if you store the ArenaAllocator.State instead. Doing so, however, requires you to pass the State the same child_allocator you used to create the original ArenaAllocator whenever you promote it back (see zig/lib/std/heap/arena_allocator.zig at master · ziglang/zig · GitHub).
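
To make the State idea concrete, here is a minimal sketch, assuming a recent Zig release where `std.heap.ArenaAllocator.State` has default field values and a `promote` method. The `Scratch` struct and its `dupe` helper are hypothetical names made up for illustration; they are not part of std.

```zig
const std = @import("std");

// Hypothetical container that stores only the arena's State (two fields).
// The caller must supply the same child allocator on every call.
const Scratch = struct {
    state: std.heap.ArenaAllocator.State = .{},

    fn dupe(self: *Scratch, child: std.mem.Allocator, text: []const u8) ![]u8 {
        // Rebuild a full ArenaAllocator from the saved state.
        var arena = self.state.promote(child);
        // Save the updated state back, whether or not the allocation succeeds.
        defer self.state = arena.state;
        return arena.allocator().dupe(u8, text);
    }

    fn deinit(self: *Scratch, child: std.mem.Allocator) void {
        // Promote one last time to free everything the arena owns.
        self.state.promote(child).deinit();
        self.state = .{};
    }
};
```

The trade-off is that nothing checks you passed the right child allocator, so this is probably only worth it when you keep many arenas around and the extra field per arena matters.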


On the contrary, you should reach for arenas when you can get away with it. There’s only a downside when your pattern ends up retaining a bunch of memory that could have been disposed of. Example: say you open a bunch of text files, read them in, and count the number of ! in each file. You don’t want to retain all that text in memory, so using an arena and emptying it at the end is not ideal.

For this, where you’re allocating a bounded amount of temporaries and freeing them when you’re done, an arena is great. It retains the text of .gitignore for a short time, but that lets you use .toOwnedSlice instead of copying the text of interest. It’s what I’d do.
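
For the file-counting example, one way to keep an arena without holding on to every file's text is to reset it after each file. The sketch below is only an illustration under some assumptions: `countBangs` is a hypothetical helper, and each text is copied with `dupe` to stand in for reading a file, so that the example has no file I/O.

```zig
const std = @import("std");

// Hypothetical helper: counts '!' across several texts. Copying each text
// into the arena stands in for reading a file with readToEndAlloc.
fn countBangs(gpa: std.mem.Allocator, texts: []const []const u8) !usize {
    var arena = std.heap.ArenaAllocator.init(gpa);
    defer arena.deinit();
    const allocator = arena.allocator();

    var total: usize = 0;
    for (texts) |text| {
        // At the end of each iteration, drop this text but keep the
        // arena's capacity so the next iteration can reuse it.
        defer _ = arena.reset(.retain_capacity);

        const contents = try allocator.dupe(u8, text);
        total += std.mem.count(u8, contents, "!");
    }
    return total;
}
```

With the reset in place, peak memory should track roughly the largest single text instead of the sum of all of them.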


Thank you for the answer.
After writing that program, I feel like an arena lets me opt in to manual memory management instead of being forced into it: I can call free() midway if I want to, and otherwise just deinit() the whole chunk at the end.
It definitely feels nicer.