Allocator swapping and type consistency

markus · May 22, 2025, 6:07pm

Hello,

When writing Zig I like to use the following pattern, as it lets me swap out my allocator based on the optimize mode or even build script flags (which I’ve never done before, but I think there are probably plenty of reasons to do so) in a nice way:

pub fn main() !void {
    var alloc_impl = switch (builtin.mode) {
        .Debug => std.heap.DebugAllocator(.{}).init,
        else => std.heap.ArenaAllocator.init(std.heap.page_allocator),
    };
    defer _ = alloc_impl.deinit();
    const allocator = alloc_impl.allocator();

    _ = allocator;
    // ...
}

const std = @import("std");
const builtin = @import("builtin");

The problem is that this stops working once I want to use an allocator like smp_allocator or c_allocator. I wonder if it would be a good idea to make all allocators live inside a struct with init, deinit and allocator functions, or alternatively to provide a wrapper for the ones that don’t.

What is your opinion on this? Does anyone else even do this?

swenninger · May 22, 2025, 6:30pm

The Zig 0.14.0 Release Notes suggest using something like

var debug_allocator: std.heap.DebugAllocator(.{}) = .init;

pub fn main() !void {
    const gpa, const is_debug = gpa: {
        if (native_os == .wasi) break :gpa .{ std.heap.wasm_allocator, false };
        break :gpa switch (builtin.mode) {
            .Debug, .ReleaseSafe => .{ debug_allocator.allocator(), true },
            .ReleaseFast, .ReleaseSmall => .{ std.heap.smp_allocator, false },
        };
    };
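    defer if (is_debug) {
        _ = debug_allocator.deinit();
    };

    _ = gpa;
    // ...
}

const std = @import("std");
const builtin = @import("builtin");
const native_os = builtin.os.tag;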

markus · May 22, 2025, 6:59pm

While I get that this works, it does feel quite messy. Something I have tried is to write a thin wrapper around smp_allocator myself, which does work nicely. It’s really just a struct with no-op methods and an allocator() that returns smp_allocator.
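For illustration, a minimal sketch of what such a wrapper could look like. The name SmpWrapper is hypothetical and not part of std; it only mirrors the init/deinit/allocator shape used in the first post, and it assumes Zig 0.14 or later, where smp_allocator and DebugAllocator exist.

```zig
const std = @import("std");
const builtin = @import("builtin");

// Hypothetical wrapper (not part of std): gives the stateless
// smp_allocator the same init/deinit/allocator surface as the
// stateful allocators.
const SmpWrapper = struct {
    pub const init: SmpWrapper = .{};

    pub fn allocator(_: *SmpWrapper) std.mem.Allocator {
        return std.heap.smp_allocator;
    }

    // No-op: smp_allocator has no per-instance state to release.
    pub fn deinit(_: *SmpWrapper) void {}
};

pub fn main() !void {
    // builtin.mode is comptime-known, so only the selected prong is
    // analyzed and the prongs may have different types.
    var alloc_impl = switch (builtin.mode) {
        .Debug => std.heap.DebugAllocator(.{}).init,
        else => SmpWrapper.init,
    };
    defer _ = alloc_impl.deinit();
    const allocator = alloc_impl.allocator();

    const buf = try allocator.alloc(u8, 16);
    defer allocator.free(buf);
}
```

With this shape, the body of main stays the same no matter which allocator the switch selects.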

Cloudef · May 23, 2025, 1:27am

Yeah, I usually wrap the main allocator in a struct like this:

const AppAllocator = struct {
    pub const is_debug = blk: {
        if (builtin.os.tag == .wasi) break :blk false;
        break :blk switch (builtin.mode) {
            .Debug, .ReleaseSafe => true,
            .ReleaseFast, .ReleaseSmall => false,
        };
    };

    debug_allocator: if (is_debug) std.heap.DebugAllocator(.{}) else void,

    pub const init: @This() = switch (is_debug) {
        true => .{ .debug_allocator = .init },
        false => .{ .debug_allocator = {} },
    };

    pub fn allocator(self: *@This()) std.mem.Allocator {
        if (builtin.os.tag == .wasi) return std.heap.wasm_allocator;
        return switch (is_debug) {
            true => self.debug_allocator.allocator(),
            false => std.heap.smp_allocator,
        };
    }

    pub fn deinit(self: *@This()) void {
        if (is_debug) _ = self.debug_allocator.deinit();
        self.* = undefined;
    }
};

vulpesx · May 23, 2025, 1:59am

It’s messy because it’s condensing logic.
This should look better.

pub fn main() !void {
    const is_debug = if (native_os == .wasi) false
        else switch (builtin.mode) {
            .Debug, .ReleaseSafe => true,
            .ReleaseFast, .ReleaseSmall => false,
        };

    var debug_allocator: std.heap.DebugAllocator(.{}) = .init;
    defer if (is_debug) {
        _ = debug_allocator.deinit();
    };

    const gpa =
        if (native_os == .wasi) std.heap.wasm_allocator
        else if (is_debug) debug_allocator.allocator()
        else std.heap.smp_allocator;

    _ = gpa;
    // ...
}

const std = @import("std");
const builtin = @import("builtin");
const native_os = builtin.os.tag;

AndrewKraevskii · May 23, 2025, 2:30am

There is an attempt to hide this logic: “add std.heap.AutoAllocator” by gooncreeper (Pull Request #23432, ziglang/zig).

Zambyte · May 23, 2025, 2:05pm

It’s worth noting that this will always compile the DebugAllocator into your program, even if it’s not used. Only top level declarations are lazily compiled. Putting something like this at the top level of your main file may be better.

const is_debug = builtin.mode == .Debug or builtin.mode == .ReleaseSafe;
var dba: std.heap.DebugAllocator(.{}) =
    if (is_debug)
        .init
    else
        @compileError("Should not use debug allocator in release mode");

pub fn main() !void {
    defer if (is_debug) {
        _ = dba.deinit();
    };
    
    const gpa = 
        if (builtin.os.tag == .wasi) std.heap.wasm_allocator
        else if (is_debug) dba.allocator()
        else std.heap.smp_allocator;
    // ...
}
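The lazy analysis of top-level declarations that this relies on can be seen in isolation. The following is a small sketch with made-up names (never_used, use_it): the @compileError only fires if the declaration is actually referenced from analyzed code, and the branch that references it is comptime-false.

```zig
const std = @import("std");

// Never referenced from analyzed code, so its initializer is never
// evaluated and the @compileError does not fire.
const never_used: u32 = @compileError("only fires if `never_used` is referenced");

// Comptime-known condition (illustrative name).
const use_it = false;

pub fn main() void {
    if (use_it) {
        // Comptime-false branch: it is parsed but not semantically
        // analyzed, so this reference does not trigger the error.
        std.debug.print("{}\n", .{never_used});
    }
    std.debug.print("compiled fine\n", .{});
}
```

Flipping use_it to true should turn the build into a compile error, which is the same mechanism that guards dba above in release modes.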

vulpesx · May 24, 2025, 3:23am

Yes, that is how Zig’s lazy evaluation works, but unused variables do get optimised out, at least in release modes, which is all we care about.

Zambyte · May 24, 2025, 1:30pm

Unused variables inside functions are not optimized out in ReleaseFast or ReleaseSafe. Check the difference in binary size when defining the unused dba inside main vs. outside with ReleaseFast.

Cloudef · May 24, 2025, 1:52pm

That would completely break comptime conditionals in Zig.
Both versions produce the same binary:

Zambyte · May 24, 2025, 2:22pm

“cached (207260B) ~12789 lines filtered” vs “cached (195250B) ~12495 lines filtered”. Where does that difference come from?

Zambyte · May 24, 2025, 2:28pm

Seeing a lot of DebugAllocator in the diff between the two exact same binaries

 .Linfo_string315:
-        .asciz  "hash_map.HashMapUnmanaged(usize,heap.debug_allocator.DebugAllocator(.{ .stack_trace_frames = 0, .enable_memory_limit = false, .safety = false, .thread_safe = true, .MutexType = null, .never_unmap = false, .retain_metadata = false, .verbose_log = false, .backing_allocator_zeroes = true, .resize_stack_traces = false, .canary = 10534666094765928719, .page_size = 131072 }).LargeAlloc,hash_map.AutoContext(usize),80).Metadata"
+        .asciz  "x86_64_sysv"
 .Linfo_string316:
-        .asciz  "*hash_map.HashMapUnmanaged(usize,heap.debug_allocator.DebugAllocator(.{ .stack_trace_frames = 0, .enable_memory_limit = false, .safety = false, .thread_safe = true, .MutexType = null, .never_unmap = false, .retain_metadata = false, .verbose_log = false, .backing_allocator_zeroes = true, .resize_stack_traces = false, .canary = 10534666094765928719, .page_size = 131072 }).LargeAlloc,hash_map.AutoContext(usize),80).Metadata"
+        .asciz  "incoming_stack_alignment"
 .Linfo_string317:
-        .asciz  "size"
+        .asciz  "?u64"
 .Linfo_string318:
-        .asciz  "available"
+        .asciz  "builtin.CallingConvention.CommonOptions"
 .Linfo_string319:
-        .asciz  "hash_map.HashMapUnmanaged(usize,heap.debug_allocator.DebugAllocator(.{ .stack_trace_frames = 0, .enable_memory_limit = false, .safety = false, .thread_safe = true, .MutexType = null, .never_unmap = false, .retain_metadata = false, .verbose_log = false, .backing_allocator_zeroes = true, .resize_stack_traces = false, .canary = 10534666094765928719, .page_size = 131072 }).LargeAlloc,hash_map.AutoContext(usize),80)"
+        .asciz  "x86_64_win"
 .Linfo_string320:
-        .asciz  "hash_map.HashMapUnmanaged(usize,heap.debug_allocator.DebugAllocator(.{ .stack_trace_frames = 0, .enable_memory_limit = false, .safety = false, .thread_safe = true, .MutexType = null, .never_unmap = false, .retain_metadata = false, .verbose_log = false, .backing_allocator_zeroes = true, .resize_stack_traces = false, .canary = 10534666094765928719, .page_size = 131072 }).LargeAlloc,hash_map.AutoContext(usize),80).empty"
+        .asciz  "x86_64_regcall_v3_sysv"

Cloudef · May 24, 2025, 2:34pm

Weird, there are no DebugAllocator references in my binaries. What’s your target?

~/d/p/tmp ❱ nix run github:Cloudef/zig2nix#latest -- build-obj -OReleaseFast a.zig
~/d/p/tmp 5.7s ❱ nix run github:Cloudef/zig2nix#latest -- build-obj -OReleaseFast b.zig

~/d/p/tmp 5.9s ❱ objdump -d a.o.o | grep DebugAllocator
~/d/p/tmp | 0 1 ❱ objdump -d b.o.o | grep DebugAllocator
~/d/p/tmp | 0 1 ❱ du -h a.o b.o
2.1M	a.o
2.1M	b.o

Zambyte · May 24, 2025, 2:36pm

The diff I posted was from the godbolt link you sent. Can you show the output of du -b a.o b.o?

Cloudef · May 24, 2025, 2:37pm

2184832	a.o
2176872	b.o

Seems there’s a slight difference, which doesn’t really make sense to me.
The code produced is the same, but the scaffolding seems to be affected?

Cloudef · May 24, 2025, 2:45pm

Simpler comparison without all the std.io junk:
https://godbolt.org/z/vf5P6K6nf

It seems a.zig puts this in the assembly:

0000000000000000 <lel>:
   0:   55                      push   %rbp
   1:   48 89 e5                mov    %rsp,%rbp
   4:   be 00 10 00 00          mov    $0x1000,%esi
   9:   31 d2                   xor    %edx,%edx
   b:   5d                      pop    %rbp
   c:   e9 4f 01 00 00          jmp    160 <heap.SmpAllocator.alloc>
  11:   66 66 66 66 66 66 2e    data16 data16 data16 data16 data16 cs nopw 0x0(%rax,%rax,1)
  18:   0f 1f 84 00 00 00 00
  1f:   00

compared to b.zig

0000000000000000 <lel>:
   0:   55                      push   %rbp
   1:   48 89 e5                mov    %rsp,%rbp
   4:   be 00 10 00 00          mov    $0x1000,%esi
   9:   31 d2                   xor    %edx,%edx
   b:   5d                      pop    %rbp
   c:   eb 02                   jmp    10 <heap.SmpAllocator.alloc>
   e:   66 90                   xchg   %ax,%ax

Zambyte · May 24, 2025, 2:45pm

It might be a bug, but I think it’s actually expected behavior. I was trying to solve this problem a while back (specifically I was trying to make a variable that would be a compile error to reference in a release build), and I only realized this was the solution when I heard Andrew specifically clarify that top level declarations are lazily compiled.

By declaring the debug allocator inside the function, it is eagerly compiled, which includes the DebugAllocator code, but since it is never used in the function, it doesn’t change the output of that function’s code. ReleaseSmall goes through the effort of cleaning up the dead DebugAllocator code, but the other build modes do not.

Cloudef · May 24, 2025, 2:53pm

Top-level declarations being lazily analyzed is its own thing, but branches on comptime variables also trigger dead code elimination, as the non-eliminated branches won’t refer to any of the declarations before. It might be that Zig’s dead code elimination is not yet that great.

const std = @import("std");
const builtin = @import("builtin");

export fn lel() [*]const u8 {
    const is_debug = if (builtin.target.os.tag == .wasi) false
        else switch (builtin.mode) {
            .Debug, .ReleaseSafe => true,
            .ReleaseFast, .ReleaseSmall => false,
    };

    var debug_allocator = if (is_debug) std.heap.DebugAllocator(.{}).init
        else {};
    defer if (is_debug) {
        _ = debug_allocator.deinit();
    };

    const gpa =
        if (builtin.target.os.tag == .wasi) std.heap.wasm_allocator
        else if (is_debug) debug_allocator.allocator()
        else std.heap.smp_allocator;

    return (gpa.alloc(u8, 4096) catch unreachable).ptr;
}

Here is a version that produces a smaller binary, for example:

~/d/p/tmp ❱ du -b a.o b.o
73720	a.o
73752	b.o

Zambyte · May 24, 2025, 3:00pm

In release builds, the type of debug_allocator is void rather than DebugAllocator, which is why the DebugAllocator code is not included. If you did this instead:

    var debug_allocator: std.heap.DebugAllocator(.{}) = if (is_debug) .init
        else undefined;
    defer if (is_debug) {
        _ = debug_allocator.deinit();
    };

It would include the DebugAllocator code, I bet.
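To make the type difference concrete, here is a small sketch that picks the type itself at comptime and prints what it ended up being. MaybeDebug and state are illustrative names, and the sketch does not measure binary size; it only shows that in ReleaseFast and ReleaseSmall the local is a void value, so nothing in main names the DebugAllocator type on the analyzed path.

```zig
const std = @import("std");
const builtin = @import("builtin");

const is_debug = switch (builtin.mode) {
    .Debug, .ReleaseSafe => true,
    .ReleaseFast, .ReleaseSmall => false,
};

// Illustrative alias: the *type* is selected at comptime, so release
// builds only ever see `void` here.
const MaybeDebug = if (is_debug) std.heap.DebugAllocator(.{}) else void;

pub fn main() !void {
    var state: MaybeDebug = if (is_debug) .init else {};
    defer if (is_debug) {
        _ = state.deinit();
    };

    const gpa = if (is_debug) state.allocator() else std.heap.smp_allocator;

    const buf = try gpa.alloc(u8, 64);
    defer gpa.free(buf);

    // Reports which type was selected for this build mode.
    std.debug.print("state is {s} ({d} bytes)\n", .{
        @typeName(MaybeDebug),
        @sizeOf(MaybeDebug),
    });
}
```

Building this with different -O modes and comparing the printed type is a quick way to check which variant a given build actually contains.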

Cloudef · May 24, 2025, 3:02pm

Yeah, but it was more to showcase my bafflement at the poor dead code elimination, I guess. Also, I’d expect the void version to give the same size as your code, since the top-level declarations would not be analyzed. But I don’t know, maybe some compiler dev can come and chip in with more insight.