if i have a struct like this one
const A1 = struct {
    big_arr_1: [32768]u32,
    big_arr_2: [32768]u32,

    pub fn init(allocator: std.mem.Allocator) !*A1 {
        return try allocator.create(A1);
    }
};
is it safe to assume that big_arr_1 and big_arr_2 will also be allocated on the heap?
I'm asking because I'm getting massive FPS drops when those two arrays have a compile-time known length, whereas allocating them at runtime with an allocator removes the drops.
I don't know the exact source of the FPS drops; I just know that making that one change, runtime-allocating those arrays, fixes them.
Arrays in Zig are not secret pointers, unlike some languages (cough, C, cough): an array is the full memory, held by value.
Since you are allocating the struct they live in, yes, they will be on the heap at runtime.
I think you are thinking of slices, which don't have a comptime-known length. In both cases the data ends up heap allocated at runtime.
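A quick way to see the difference (a minimal sketch; the struct names here are made up for illustration):

```zig
const std = @import("std");

// A Zig array is the full value; a slice is just a pointer + length.
const WithArray = struct {
    data: [32768]u32, // 128 KiB of elements stored inline in the struct
};

const WithSlice = struct {
    data: []u32, // 16 bytes: pointer + length; elements live elsewhere
};

pub fn main() void {
    // WithArray is 131072 bytes, WithSlice is 16 bytes (on 64-bit targets)
    std.debug.print("WithArray: {d} bytes, WithSlice: {d} bytes\n", .{
        @sizeOf(WithArray), @sizeOf(WithSlice),
    });
}
```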
Can you share the code where you use them, ideally the code that causes the issue if you can narrow that down?
And what zig version are you using?
Unnecessary memory copies are the likely cause. Zig used to have a bug where it copied arrays when indexing, but that should be fixed. It's also possible that you have code that copies them when it shouldn't.
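For instance, a hypothetical sketch of an accidental copy: assigning a big array field to a local makes a full 128 KiB copy, while iterating through a pointer does not.

```zig
const std = @import("std");

const A1 = struct {
    big_arr_1: [32768]u32,
};

fn sumWithCopy(a: *const A1) u32 {
    const arr = a.big_arr_1; // copies all 128 KiB onto the stack
    var total: u32 = 0;
    for (arr) |x| total +%= x;
    return total;
}

fn sumNoCopy(a: *const A1) u32 {
    var total: u32 = 0;
    for (&a.big_arr_1) |*x| total +%= x.*; // iterates via element pointers, no copy
    return total;
}

pub fn main() void {
    var a: A1 = undefined;
    @memset(&a.big_arr_1, 1);
    // both print 32768; the first just does a lot of extra memory traffic
    std.debug.print("{d} {d}\n", .{ sumWithCopy(&a), sumNoCopy(&a) });
}
```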
The arrays get allocated wherever the allocator passed to the init method puts them, e.g. heap or stack.
For example, a page allocator passed to init puts the struct on the heap, while a fixed buffer allocator backed by a stack buffer keeps it on the stack.
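Both cases might look something like this (a minimal sketch, reusing the A1 from the first post):

```zig
const std = @import("std");

const A1 = struct {
    big_arr_1: [32768]u32,
    big_arr_2: [32768]u32,

    pub fn init(allocator: std.mem.Allocator) !*A1 {
        return try allocator.create(A1);
    }
};

pub fn main() !void {
    // Heap: page_allocator places the whole 256 KiB struct on the heap.
    const on_heap = try A1.init(std.heap.page_allocator);
    defer std.heap.page_allocator.destroy(on_heap);
    on_heap.big_arr_1[0] = 1;

    // Stack: a FixedBufferAllocator backed by a local buffer keeps it on the stack.
    var buf: [@sizeOf(A1)]u8 align(@alignOf(A1)) = undefined;
    var fba = std.heap.FixedBufferAllocator.init(&buf);
    const on_stack = try A1.init(fba.allocator());
    on_stack.big_arr_2[0] = 2;
}
```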
Ah, that makes sense. Here's how I use those two arrays:
pub fn getBlock(self: *const LocalChunk, index: usize) Block {
    return @atomicLoad(Constants.Block, &self.blocks[index], std.builtin.AtomicOrder.acquire);
}

pub fn setBlock(self: *LocalChunk, index: usize, new_block: Block) BlockLight {
    @atomicStore(Constants.Block, &self.blocks[index], new_block, std.builtin.AtomicOrder.release);
    const data = new_block.blockData().light;
    self.light_emitter.setBit3D(index, data.emits);
    return self.fetchUpdateLight(index, data.default);
}

pub fn setLight(self: *LocalChunk, index: usize, new_light: BlockLight) void {
    @atomicStore(Constants.BlockLight, &self.lights[index], new_light, std.builtin.AtomicOrder.release);
}

pub fn getLight(self: *const LocalChunk, index: usize) BlockLight {
    return @atomicLoad(Constants.BlockLight, &self.lights[index], std.builtin.AtomicOrder.acquire);
}

pub fn fetchUpdateLight(self: *const LocalChunk, index: usize, new_light: BlockLight) BlockLight {
    return @atomicRmw(Constants.BlockLight, &self.lights[index], std.builtin.AtomicRmwOp.Xchg, new_light, std.builtin.AtomicOrder.acq_rel);
}
Those are the only uses of those two arrays; everything else calls these functions to get or set elements. Right now the arrays are runtime allocated, as is the struct itself, so the functions are written with that in mind.
Also, I just noticed fetchUpdateLight takes a const pointer to the chunk even though it mutates the struct. Interesting.
Check the assembly to see if it's copying the whole array. Godbolt is good, but it's a website, so it's not usable unless you can reduce the code to something small.
There are various tools to see assembly locally; objdump is what I use on Linux, and it's usually preinstalled or comes with other dev tools.
Debuggers also typically let you see assembly.
It's also worth asking which optimize modes the slowdown occurs in. It's still worth reporting regardless, but you could work around the issue by using a different mode. If you're using Debug on x86_64 Linux, then you'd be using the self-hosted backend, which has less optimal output but compiles faster; you can force the use of LLVM in the exe/lib/etc. options.
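Forcing LLVM could look something like this in build.zig (a sketch; `exe` here is assumed to be your existing compile step, and in recent versions the option can also be passed in addExecutable's options struct):

```zig
// build.zig sketch: opt out of the self-hosted backend, even in Debug builds
exe.use_llvm = true;
```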
That would be a serious bug, but do confirm it's actually a problem and not just a typo.
Oh no, this was copied straight from my source code.
fetchUpdateLight is used in the program, so it's not being ignored.
I'll make a bug report with reproducible code soon.
EDIT: ^ this only compiles for heap-allocated slices; with arrays, the compiler knows it's a constant and throws an error.
Just tested my code; the issue still happens in Debug, ReleaseFast, and ReleaseSafe modes.
(I use latest master.)
I don't really know how to read asm, but I got a minimal assembly dump of the relevant functions using -fstrip and -O ReleaseSmall. This is with the arrays set to a comptime-known length:
        .intel_syntax noprefix
        .file "local_chunk"
        .text
        .globl getBlock
        .type getBlock,@function
getBlock:
        .cfi_startproc
        push rbp
        .cfi_def_cfa_offset 16
        .cfi_offset rbp, -16
        mov rbp, rsp
        .cfi_def_cfa_register rbp
        mov eax, dword ptr [rdi + 4*rsi]
        pop rbp
        .cfi_def_cfa rsp, 8
        ret
.Lfunc_end0:
        .size getBlock, .Lfunc_end0-getBlock
        .cfi_endproc
        .globl setBlock
        .type setBlock,@function
setBlock:
        .cfi_startproc
        push rbp
        .cfi_def_cfa_offset 16
        .cfi_offset rbp, -16
        mov rbp, rsp
.Lfunc_end1:
        .size setBlock, .Lfunc_end1-setBlock
        .cfi_endproc
        .globl fetchUpdateLight
        .type fetchUpdateLight,@function
fetchUpdateLight:
        .cfi_startproc
        push rbp
        .cfi_def_cfa_offset 16
        .cfi_offset rbp, -16
        mov rbp, rsp
        .cfi_def_cfa_register rbp
        mov eax, edx
        xchg dword ptr [rdi + 4*rsi + 131072], eax
        pop rbp
        .cfi_def_cfa rsp, 8
        ret
.Lfunc_end2:
        .size fetchUpdateLight, .Lfunc_end2-fetchUpdateLight
        .cfi_endproc
        .globl setLight
        .type setLight,@function
setLight:
        .cfi_startproc
        push rbp
        .cfi_def_cfa_offset 16
        .cfi_offset rbp, -16
        mov rbp, rsp
        .cfi_def_cfa_register rbp
        mov dword ptr [rdi + 4*rsi + 131072], edx
        pop rbp
        .cfi_def_cfa rsp, 8
        ret
.Lfunc_end3:
        .size setLight, .Lfunc_end3-setLight
        .cfi_endproc
        .globl getLight
        .type getLight,@function
getLight:
        .cfi_startproc
        push rbp
        .cfi_def_cfa_offset 16
        .cfi_offset rbp, -16
        mov rbp, rsp
        .cfi_def_cfa_register rbp
        mov eax, dword ptr [rdi + 4*rsi + 131072]
        pop rbp
        .cfi_def_cfa rsp, 8
        ret
.Lfunc_end4:
        .size getLight, .Lfunc_end4-getLight
        .cfi_endproc
        .section ".note.GNU-stack","",@progbits
You’re sure the callsites are in turn used, so the function is actually analyzed? I have very similar code using rmw, and making the pointer argument const gives me the expected “cast discards const qualifier” error on master (just pulled today). 0.15.2 gives the same error on that code.
Apologies, I should've clarified: I filed a bug report, and it only happens with heap-allocated slices.
https://codeberg.org/ziglang/zig/issues/31444 is the issue i made.
Ah, interesting case, but I don’t think it has anything to do with heap allocation.
Same thing here without heap allocation:
const std = @import("std");

const Foo = struct {
    arr_1: []u32,
    arr_2: []u32,

    pub fn constButStillMutating(self: *const Foo, d: u32, index: usize) u32 {
        return @atomicRmw(u32, &self.arr_1[index], .Xchg, d, .acq_rel);
    }
};

pub fn main() !void {
    var arr1: [2]u32 = .{ 0, 0 };
    var arr2: [2]u32 = .{ 0, 0 };
    var v: Foo = undefined;
    var f = &v;
    f.* = .{
        .arr_1 = &arr1,
        .arr_2 = &arr2,
    };
    _ = f.constButStillMutating(2, 1);
}
The difference between your code and mine is that you have slices in the fields while I use arrays, so there's one more level of indirection in your case. So it's probably fine.
To clarify, I think you're just seeing that constness doesn't propagate through pointers, which is how Zig works:
const std = @import("std");

const MutableStruct = struct {
    value: []i32 = undefined,
};

pub fn mutate(ptr: *const MutableStruct, new_value: i32) void {
    ptr.value[0] = new_value;
}

pub fn main() void {
    var arr: [2]i32 = .{ 0, 0 };
    const myStruct = MutableStruct{ .value = &arr };
    mutate(&myStruct, 20);
    std.debug.print("After mutation: {any}\n", .{myStruct.value});
}
The array version probably mixed us both up, as that one would be a problem. Of course slices are pointers and have their own constness.
Speaking of which: if the data is in slices, then OP probably wouldn't need to heap allocate the struct at all. I don't know if they were doing that, but I figured I should point it out.
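E.g. a minimal sketch (element types simplified to u32 from OP's Block/BlockLight):

```zig
const std = @import("std");

const LocalChunk = struct {
    blocks: []u32,
    lights: []u32,
};

pub fn main() !void {
    const gpa = std.heap.page_allocator;
    // Only the element storage is heap-allocated; the struct itself can live
    // on the stack, since it is just two slices (32 bytes on 64-bit targets).
    const chunk = LocalChunk{
        .blocks = try gpa.alloc(u32, 32768),
        .lights = try gpa.alloc(u32, 32768),
    };
    defer gpa.free(chunk.blocks);
    defer gpa.free(chunk.lights);
    chunk.blocks[0] = 1;
}
```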