Allocator Questions

Coming from an automatic memory management world, I often have questions pop-up in my head when using allocators. Here are some:

  1. Is it OK to create many different arena allocators from the same general purpose allocator?
  2. Is it OK to create an arena allocator with another arena allocator as its child?
  3. If I have different parts of a program that use arena allocators, would it be better to just have a single one and init / deinit every time I needed to reuse it, versus having different arenas for each part?
  4. Is there a way to move an allocation from one allocator to another? I know you can copy with dupe, but I’m thinking more in the line of one allocator telling another “hey, here’s this allocation so you can manage it from here on, because I’m about to free everything else and go away.”
4 Likes
  1. Is it OK to create many different arena allocators from the same general purpose allocator?

Yes. For me I don’t think I’ve ever used a GPA as the backing for an arena, I normally use the simple PageAllocator underneath since in most cases GPA might be overkill, but there are definitely some use cases where that would make sense.

  1. Is it OK to create an arena allocator with another arena allocator as its child?

This is OK but I’m not sure it makes much sense. Ignoring the extra overhead, at best I don’t think this would provide any advantages over just using the initial arena allocator, at worst this this could cause things to stay around for longer than they would if you just used the original arena. I could be missing something here though.

  1. If I have different parts of a program that use arena allocators, would it be better to just have a single one and init / deinit every time I needed to reuse it, versus having different arenas for each part?

I believe that would be better yes. It should reduce the overall memory footprint and is also just a simpler design since you’re program will have less state to manage.

Some general advice here, start of with the “simplest design” first. If you get an idea that you think might be faster, write a benchmark, then try out your idea and see if your benchmark noticeably improves.

  1. Is there a way to move an allocation from one allocator to another? I know you can copy with dupe , but I’m thinking more in the line of one allocator telling another “hey, here’s this allocation so you can manage it from here on, because I’m about to free everything else and go away.”

That’s an interesting idea but none of the allocators can do this today. This would be possible for some allocators and others not possible. I would expect something like this to be a one-off use case in which case you could just roll your own custom allocation code for it. The std Allocator interface is there for you to play well with the rest of std and other libraries, but you can always make enhancements and add new features in your application.

5 Likes

Thanks a lot for the answers and advice @marler8997 , they clear a lot of things up. The case of arena with arena child has popped up when developing a recursive evaluator for an interpreted language, where somewhere deep in the recursion the evaluation of an AST node may require lots of temporary allocations, which an arena is perfect for. But all this is already happening within the context of an outer, more long-lived program-level arena. But as you point out, maybe this is all premature optimization if I don’t actually test it with some benchmarks.

As a side note, I switched from the GPA to the page_allocator and the benchmark shows more than 5x faster execution. This emphasizes the importance benchmark testing has and the impact the choice of an allocator can have on performance.

4 Likes

I have also noticed previously that changing the allocator can make a huge difference in running time. Sometimes it is easy to reason about (FixedBufferAllocator vs. anything else), but I would like to share my latest finding.

Here is my code solving AoC Day23 Part 2 (Btw any feedback on style or quality is welcome, I am mechanical engineer coding just for fun, so You have been warned :slight_smile: ):

const std = @import("std");
const math = std.math;

const Amphipod = enum {
    A,
    B,
    C,
    D,
};

const hallway_length = 7;
const siderooms = 4;
const sideroom_depth = 4;

const PathType = u4;

const State = struct {
    hallway: [hallway_length]?Amphipod = .{null} ** hallway_length,
    rooms: [siderooms][sideroom_depth]?Amphipod,
    fn print(self: @This()) []u8 {
        const blank =
            \\#############
            \\#...........#
            \\###.#.#.#.###
            \\  #.#.#.#.#\\
            \\  #.#.#.#.#\\
            \\  #.#.#.#.#
            \\  #########
        ;
        var buffer: [100]u8 = undefined;
        var mod: []u8 = buffer[0..blank.len];
        std.mem.copy(u8, mod, blank);
        for (self.hallway) |amp, idx| {
            if (amp != null) {
                const i: usize = switch (idx) {
                    0 => 14 + 1,
                    1 => 14 + 2,
                    2 => 14 + 4,
                    3 => 14 + 6,
                    4 => 14 + 8,
                    5 => 14 + 10,
                    6 => 14 + 11,
                    else => unreachable,
                };
                const a: u8 = switch (amp.?) {
                    .A => 'A',
                    .B => 'B',
                    .C => 'C',
                    .D => 'D',
                };
                mod[i] = a;
            }
        }
        for (self.rooms) |room, rid| {
            for (room) |amp, aid| {
                if (amp != null) {
                    const a: u8 = switch (amp.?) {
                        .A => 'A',
                        .B => 'B',
                        .C => 'C',
                        .D => 'D',
                    };
                    mod[14 + 14 + rid * 2 + 3 + 14 * aid] = a;
                }
            }
        }
        return mod;
    }
    fn sideRoomDone(self: @This(), room: usize) bool {
        std.debug.assert(room < siderooms);
        for (self.rooms[room]) |amp| {
            // https://github.com/ziglang/zig/issues/6059
            if (amp == null) {
                return false;
            } else if (amp != @intToEnum(Amphipod, room)) return false;
        }
        return true;
    }
    fn done(self: @This()) bool {
        // check if hallway is empty
        for (self.hallway) |item| {
            if (item != null) return false;
        }

        // every room is done
        for (self.rooms) |_, idx| {
            if (!self.sideRoomDone(idx)) return false;
        }
        return true;
    }
    fn sideRoomFreeSpace(self: @This(), room: usize) ?u3 {
        var ret: u3 = 0;

        std.debug.assert(room < siderooms);
        for (self.rooms[room]) |amp| {
            if (amp == null) {
                ret += 1;
            } else if (amp != @intToEnum(Amphipod, room)) {
                return null;
            }
        }

        std.debug.assert(ret <= sideroom_depth);
        return ret;
    }
    fn checkSideRoom(self: @This()) ?StateQueue {
        for (self.rooms) |_, idx| {
            const free = self.sideRoomFreeSpace(idx) orelse continue;
            if (free == 0) continue;

            // std.debug.print("FREE: {d}\n", .{free});

            for (self.hallway) |amp, hwidx| {
                if (amp == @intToEnum(Amphipod, idx)) {
                    const steps = (self.checkHWtoRoom(hwidx, idx) orelse continue) + free;

                    var ret: StateQueue = undefined;

                    ret.energy = (std.math.powi(usize, 10, @enumToInt(amp.?)) catch unreachable) * steps;

                    ret.state = self;
                    ret.state.hallway[hwidx] = null;
                    ret.state.rooms[idx][free - 1] = amp;

                    return ret;
                }
            }

            for (self.rooms) |other, oidx| {
                if (idx == oidx) continue;

                var pos: PathType = 0;
                for (other) |amp| {
                    if (amp == null) {
                        pos += 1;
                        continue;
                    }
                    if (amp == @intToEnum(Amphipod, idx)) {
                        var steps: PathType = undefined;
                        if (idx < oidx) { // moving left
                            const hwidx = oidx + 2;
                            // std.debug.print("L->R from: {d} to: {d}\n", .{ hwidx, idx });
                            steps = (self.checkHWtoRoom(hwidx, idx) orelse break) + free + pos;
                        } else { // moving right
                            const hwidx = oidx + 1;
                            // std.debug.print("R->L from: {d} to: {d}\n", .{ hwidx, idx });
                            steps = (self.checkHWtoRoom(hwidx, idx) orelse break) + free + pos;
                        }

                        // we can move amp to his sideroom
                        var ret: StateQueue = undefined;

                        ret.energy = (std.math.powi(usize, 10, @enumToInt(amp.?)) catch unreachable) * steps;

                        ret.state = self;
                        ret.state.rooms[oidx][pos] = null;
                        ret.state.rooms[idx][free - 1] = amp;

                        return ret;
                    }
                    // only first item can move, no need to check others
                    break;
                }
            }
        }

        return null;
    }
    fn checkHWtoRoom(self: @This(), from: usize, to: usize) ?PathType {
        // #############
        // #01.2.3.4.56#
        // ###0#1#2#3###
        //   #0#1#2#3#
        //   #########

        std.debug.assert(to < 4);
        std.debug.assert(from < 8);

        var steps: PathType = 0;
        if (from -| to >= 2) {
            var idx: usize = from -| 1;
            while (idx >= to + 2) : (idx -= 1) {
                // std.debug.print("LEFT from: {d} to: {d} step: {d}\n", .{ from, to, idx });
                if (self.hallway[idx] == null) steps += 2 else return null;
            }
        } else {
            var idx: usize = from + 1;
            while (idx <= to + 1) : (idx += 1) {
                // std.debug.print("RIGHT from: {d} to: {d} step: {d}\n", .{ from, to, idx });
                if (self.hallway[idx] == null) steps += 2 else return null;
            }
        }
        // room entrance point
        if (from != 6 and from != 0) steps += 1;

        return steps;
    }
    fn getValidStates(self: @This(), allocator: std.mem.Allocator) ![]StateQueue {
        var ret = std.ArrayList(StateQueue).init(allocator);

        // Amphipod can move to its final place, always a good move
        if (self.checkSideRoom()) |item| {
            try ret.append(item);
            return ret.toOwnedSlice();
        }

        // we already checked the side rooms, so we only have to collect
        // each rooms first item's possible hallway positions
        for (self.rooms) |room, idx| {
            // std.debug.print("ROOM->HW check {d} room\n", .{idx});
            for (room) |amp, pos| {
                if (amp == null) continue;

                // do not move items in right room
                // except: we should move away to let wrong placed amp out
                if (amp == @intToEnum(Amphipod, idx)) {
                    var tainted: bool = false;

                    var next: PathType = @intCast(PathType, pos) + 1;
                    while (next < sideroom_depth) : (next += 1) {
                        if (room[next] != @intToEnum(Amphipod, idx)) tainted = true;
                    }
                    if (!tainted) break;
                }

                for (self.hallway) |hw, hwidx| {
                    if (hw == null) { // destination hallway spot is free
                        // std.debug.print("checking hallway spot {d} with amp {any}\n", .{hwidx, amp});
                        const steps = (self.checkHWtoRoom(hwidx, idx) orelse continue) + pos + 1;

                        var st: StateQueue = undefined;

                        st.energy = steps * try std.math.powi(usize, 10, @enumToInt(amp.?));

                        st.state = self;
                        st.state.hallway[hwidx] = amp;
                        st.state.rooms[idx][pos] = null;

                        try ret.append(st);
                    }
                }
                // only first item can move
                break;
            }
        }

        return ret.toOwnedSlice();
    }
};

const StateQueue = struct {
    state: State,
    energy: usize,
};

pub fn main() !void {
    var timer = try std.time.Timer.start();
    const ret = try second();
    const t = timer.lap() / 1000;

    try std.testing.expectEqual(@as(usize, 43226), ret);

    std.debug.print("Day 23b result: {d} \t\ttime: {d}µs\n", .{ ret, t });
}

fn lessThan(context: void, a: StateQueue, b: StateQueue) std.math.Order {
    _ = context;
    if (a.energy == b.energy) {
        return .eq;
    } else if (a.energy < b.energy) {
        return .lt;
    } else if (a.energy > b.energy) {
        return .gt;
    } else unreachable;
}

pub fn second() !usize {
    var arena = std.heap.ArenaAllocator.init(std.heap.page_allocator);
    defer arena.deinit();
    const allocator = arena.allocator();

    var queue = std.PriorityQueue(StateQueue, void, lessThan).init(allocator, {});
    defer queue.deinit();

    var visited = std.AutoHashMap(State, void).init(allocator);
    defer visited.deinit();

    // var backtrack = std.AutoHashMap(StateQueue, StateQueue).init(allocator);
    // defer backtrack.deinit();

    var init_state: StateQueue = undefined;
    init_state.state = try parseInput();
    init_state.energy = 0;

    try queue.add(init_state);

    while (queue.count() > 0) {
        // var steps: usize = 0;
        // while (steps < 1) : (steps += 1) {
        const curr = queue.remove();

        // std.debug.print("{d}\n{s}\n", .{ curr.energy, curr.state.print() });
        // std.debug.print("{d} ", .{curr.energy});

        // skip if already checked
        if (visited.contains(curr.state)) continue;
        try visited.put(curr.state, {});

        // Amphipods are in place, return energy
        if (curr.state.done()) {
            // var bt: StateQueue = curr;
            // while (backtrack.contains(bt)) {
            //     std.debug.print("{d}\n{s}\n", .{bt.energy, bt.state.print()});
            //     bt = backtrack.get(bt).?;
            // }
            return curr.energy;
        }

        var possible_states = try curr.state.getValidStates(allocator);
        defer allocator.free(possible_states);

        for (possible_states) |*sq| {
            sq.energy += curr.energy;
            // std.debug.print("{d}\n{s}\n", .{ sq.energy, sq.state.print() });
            try queue.add(sq.*);
            // try backtrack.put(sq.*, curr);
        }
    }

    unreachable;
}

fn parseInput() !State {
    const input =
    \\#############
    \\#...........#
    \\###D#D#B#A###
    \\  #B#C#A#C#
    \\  #########
    ;
    var lines = std.mem.split(u8, input, "\n");

    var ret = State{ .rooms = undefined };
    ret.rooms[0][1] = .D;
    ret.rooms[1][1] = .C;
    ret.rooms[2][1] = .B;
    ret.rooms[3][1] = .A;
    ret.rooms[0][2] = .D;
    ret.rooms[1][2] = .B;
    ret.rooms[2][2] = .A;
    ret.rooms[3][2] = .C;

    var line: usize = 0;
    var counter: usize = 0;
    while (lines.next()) |l| : (line += 1) {
        if (line == 2 or line == 3) {
            for (l) |ch, idx| {
                if (idx == 3 or idx == 5 or idx == 7 or idx == 9) {
                    const amp: Amphipod = switch (ch) {
                        'A' => .A,
                        'B' => .B,
                        'C' => .C,
                        'D' => .D,
                        else => unreachable,
                    };
                    // std.debug.print("{d} {d}\n", .{counter%siderooms, counter/siderooms});
                    ret.rooms[counter % siderooms][counter / siderooms] = amp;
                    counter += 1;
                }
            }
        }
        if (line == 2) counter += siderooms * 2;
    }

    return ret;
}

test "day23b" {
    try std.testing.expectEqual(@as(usize, 43226), try second());
}

When I use std.testing.allocator it runs for 5225 ms on my rpi4. Using std.heap.page_allocator makes it faster: 3395 ms. But I was blown away with the final version (using ArenaAllocator) which finishes in 1365 ms.
If I use std.testing.allocator as the child allocator for ArenaAllocator the running time remains the same.

My theory: I call getValidStates() a lot. When I use ArenaAllocator there it do not have to reinitialize the allocator on every round, it just use the already initialized one which makes it lot faster.
(To confirm this theory I have changed the PriorityQueue and AutoHashMap allocator to std.testing.allocator and got the very same runtime.)

What do You think?

2 Likes

I’ve also seen huge speed-ups when switching to ArenaAllocator. On the other hand, I also see increased total memory footprint, so as always, you have to consider the tradeoffs and constraints of the specific project. If memory is really scarce, maybe an arena isn’t the best option or not an option at all. Benchmark testing is crucial to see what’s best.

2 Likes

Generally you’d see a speedup with an ArenaAllocator if you have to allocate/deallocate a lot during the lifetime of your program. Allocating memory blocks is an expensive operation, which ArenaAllocator amortizes by allocating large chunks of memory at a time and then allocating memory from your objects from within that preallocated memory.

1 Like