Memory guards between Zig and Global Assembly

whitehexagon · November 15, 2025, 10:54am

I now have interrupts working via some global assembly (aarch64/armv8) on my Zig PinePhone OS project Currently it just calls back into Zig via exported fn. But I need to avoid running ‘slow’ code in the Zig IRQ handlers.

So I was thinking to add some shared flags between asm and Zig for each IRQ. Maybe just some paired LDXR and STXR calls from the assembly side. And export that memory to Zig. Since I dont think Zig’s volatile pointers are designed for this? what would be the Zig way of interacting with these flags in a memory guarded way.

My assembly is very rusty, and not sure if I also need some Acquire semantics on both sides too. It obviously needs to be lightweight and fast. eg. I know my SoC has some spinlock hardware, but that seems a bit heavy handed for a simple memory guard. And my A53 doesnt seem to have CAS. Any thoughts or advice please?

ww520 · November 15, 2025, 6:27pm

Just to clarify, you want to share memory between different CPUs, right? Not particularly between the assembly and Zig code? Since the interrupt handler in assembly calling back into the Zig code runs on the same CPU, no memory guard is needed.

I assume your scenario is the following. Let me know if it’s off.

IRQ happens. A CPU (1) with the IRQ affinity is interrupted to handle it.
The IRQ assembly handler runs on the CPU1 calls into a Zig function [A].
The IRQ handling completes. The call returns from Zig [A] to assembly and the IRQ handler exits. CPU1 resumes its normal execution.
Meanwhile some Zig code [B] is running on another CPU (2).

You want to share memory between [A] and [B].

Zig has the Atomics primitives to deal with cross thread (cross CPU) memory access. Also the std.atomic.Value wrapper make life much easier to use the atomics on a specific type of value.

I believe the atomic primitives are ultimately translated by LLVM into each target platform’s memory load/store assembly. Instead of knowing which assembly instructions are used for each platform, you can use Zig atomics on both ends [A] and [B], and let LLVM deal with it.

var flag1: std.atomic.Value(bool) = std.atomic.Value(bool).init(false);

fn setFlag1(value: bool) void {
    flag1.store(value, std.builtin.AtomicOrder.seq_cst);
}

fn loadFlag1() bool {
    return flag1.load(std.builtin.AtomicOrder.seq_cst);
}

// IRQ assembly handler calls into this.
fn irq_callback(...) void {
    if (loadFlag1())
        ...
    else
        ...
    setFlag1(true);
}

// This can run on a different thread.
fn my_main_code(...) void {
    ...
    if (loadFlag1())
        ...
    else
        setFlag1(false)
        ...
}

Edit: note that LLVM was used for the older version of Zig. The latest version has its own backend codegen. Regardless letting the compiler to generate the memory access code is easier than keeping track of which instructions are being used by which versions of the codegen.

whitehexagon · November 16, 2025, 9:27am

Thanks for the detailed response. My brain is not quite multi-core capable yet

I want to avoid the assembly calling Zig, since the assembly only needs to safely set a flag in memory. This keeps my vector table very simple, since just a few instructions for a guarded value set, and clear the interrupt.

Then my Zig event loop can check those flags periodically (event loop). Hence my worry about value/memory visibility correctness, or clever compiler re-orderings. So my question was how does Zig interact with the same correctness guarantees.

But now that I explain it, I see that my Zig code could just call down to the same assembly routines. But you are right, my goal is to move IRQ to 2nd core, which is kinda ironic having just implemented GIC logic.

I also wasnt aware that same core memory manipulation didnt need guards. Thanks! that should simplify some of my worries. I was doing high performance multi-threaded Java for far too long, where even setting a 64bit value had some nuances. In the meantime the chips evolved and started breeding extra cores

ww520 · November 16, 2025, 11:14pm

Ah good, the assumption is confirmed at least. Yes, I agree that keeping things simple and self-contained in assembly is good. Perhaps you can look at the generated code for the target machine to see what assembly instructions are used for the Zig atomic operations. Just copy the assembly code into your IRQ handler. Then incorporate the Zig code in your main Zig program on the other end.

Here’s a minimum set of code for the atomic operations, flag.zig:

const AtomicOrder = @import("std").builtin.AtomicOrder;

// A direct memory mapped into the virtual address of every process.
const MEM_ADDR: u64 = 0x00000000_00001234;
const void_ptr: *u8 = @ptrFromInt(MEM_ADDR);
const mem_ptr: *u32 = @ptrCast(@alignCast(void_ptr));

export fn get() u32 {
    return @atomicLoad(u32, mem_ptr, AtomicOrder.seq_cst);
}

export fn set() void {
    @atomicStore(u32, mem_ptr, 1, AtomicOrder.seq_cst);
}

export fn clear() void {
    @atomicStore(u32, mem_ptr, 0, AtomicOrder.seq_cst);
}

Change MEM_ADDR to your choice of direct physical memory address. I’m not familiar how you set up the memory mapping in the OS. Setting up a range of physical memory addresses to have the same virtual memory address for all processes make shared access to the same memory location much easier in all places, in user mode and in kernel mode.

Here’s a Comipler Explorer link for the generated assembly code. Looks like it just uses LDAR and STLR for the aarch64-linux target. The assembly code can be copied and pasted into the irq handler.

Yes, memory access on the same CPU doesn’t need barrier operations as code running on it has direct access to the L1/L2 caches of the CPU. Memory barrier ops are needed to flush the changes in L1/L2 of the the CPU to the main memory and for the other CPUs to load from the main memory into their L1/L2. Java is a good place to learn about these things; it’s actually the first language to have a formal memory model defined.

whitehexagon · November 17, 2025, 3:57pm

Thanks for the feedback. It’s working, but after all that, I realised I need to signal the phone hardware that the interrupt has been handled (not just GIC), and that needs to happen synchronously with the IRQ, so I’m back to calling back to Zig, but the design is good and my I2C state machine is working much smoother. So I’ll call it a day and see if I can get these compass readings calibrated/decoded thanks again for your input.