Load and Run a 2nd binary

whitehexagon · March 4, 2026, 5:12pm

So my Zig PinePhone OS is ready for some Apps. After my recent refactoring, I now have quite a nice client API, so I can easily build and include some simple Apps with the OS.

But if I want to distribute bigger Apps separately, I’m a bit lost. I’m new to OS/systems programming, so I’m looking for some guidance, please. For example: my OS is loaded, and the user selects a binary to load from the SD card.

My first idea is to build my OS with a known address range of API function pointers that a 2nd binary can look up. Maybe the first bunch of 32-bit numbers in the OS binary are a fixed-address set of API function pointers that somehow get updated each time I build my OS. Looks like @intFromPtr on some fn pointers is the way to go there?

A loaded binary/App can then look up these addresses (fn pointers stored as u32) in that table, and jump to the address it needs.
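A minimal sketch of the kernel side of that idea, assuming Zig 0.14 or later. Every name here (`ApiTable`, `plotImpl`, `lineImpl`, `api_table`, `last_plot`) is hypothetical, and the entries are pointer-sized rather than `u32` so the sketch does not depend on pointer width. Because the table is initialised from the functions themselves, the toolchain refreshes the addresses on every build; only the location of the table would need pinning.

```zig
// Kernel side. All names here are hypothetical.
// `extern struct` gives the table a C-compatible, predictable layout,
// so a separately built app can rely on the field offsets.
pub const ApiTable = extern struct {
    version: usize,
    plot: *const fn (x: u32, y: u32) callconv(.c) void,
    line: *const fn (x0: u32, y0: u32, x1: u32, y1: u32) callconv(.c) void,
};

// Stand-in for the real framebuffer write, so the sketch runs on a host.
pub var last_plot: [2]u32 = .{ 0, 0 };

fn plotImpl(x: u32, y: u32) callconv(.c) void {
    last_plot = .{ x, y };
}

fn lineImpl(x0: u32, y0: u32, x1: u32, y1: u32) callconv(.c) void {
    // Placeholder: a real line() would step between the end points.
    plotImpl(x0, y0);
    plotImpl(x1, y1);
}

// The compiler and linker fill in the function addresses on every build.
// To pin the table itself, one option is to add
// `linksection(".api_table")` to this declaration and place that
// section at a fixed address in the linker script.
pub const api_table: ApiTable = .{
    .version = 1,
    .plot = &plotImpl,
    .line = &lineImpl,
};

// Raw address of one entry, e.g. for logging or a hand-built table.
pub fn plotAddress() usize {
    return @intFromPtr(api_table.plot);
}
```

A version field at the start is one cheap way for an app to check that it was built against a compatible table.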

Am I on the right track with my thinking? How do my Apps get a nice API to work with that doesn’t have an implementation until loaded by my OS? Do I just have some kind of dummy API stubs that do a goto? I thought @call might be it, but that seems to require symbols known at compile time. So maybe some assembly is the way to do the goto?

mnemnion · March 4, 2026, 10:59pm

It’s not totally clear which part of this you’re asking about: loading position-independent code into memory, syscalls, how to do dynamic linking, general ‘pack object code into an executable’ stuff, maybe all of this?

If it were up to me, I’d be reading everything I could get my hands on about ELF, and probably just doing what ELF does. That’s if I understood you, which I’m not sure about.

whitehexagon · March 6, 2026, 12:22pm

Agreed, it is not clear to me either. The last time I was developing stuff this low level was 68000 on the Amiga, so please forgive my ignorance. But your post gave some good entry points for further research, which is really helpful, thanks!

So I was thinking I already have an ‘OS’, but now I think the term I was looking for is possibly Microkernel, and I’m not even sure I have that yet.

But if we take something simple, like plot(x,y). My microkernel holds an implementation of that which includes access to the SoC/hardware. Above the hardware layer I have generated some more user-friendly API for things like line() which delegate to plot().

Currently, the built-in apps are part of my microkernel, and have simple, easy access to that API. Now I try to imagine how a 2nd Zig binary (objcopy bin) loaded at runtime is going to work. We can ignore the portability ELF would offer, and assume it is just a blob of ARM assembly with a known entry point. Maybe this is what you mean by position-independent code?

One solution is to just build each app as a complete microkernel, if that is the right term here, but this obviously leads to complexity accessing shared hardware resources.

Another idea, which I portrayed badly, was that my microkernel somehow has a table of pointers to the implementations of this API. Maybe I can use the linker file to fix the location. But that table would be at a fixed publicly known location in RAM. So for shared hardware resources I can manage concurrency more easily, since every access goes via these system? calls. I think DOS VGA programming worked something like this.

What I can’t get my head around is how the 2nd binary can make a call to these addresses/function pointers. It is something like an @extern, but without the implementation in the compile unit. So here I was thinking I’d probably need some assembly to do a ‘BL’. Then my line() API for the 2nd binary would wrap a bit of assembly that would prepare registers and goto/gosub the fixed address for the correct function. I hope that makes more sense now?

And callbacks maybe work in a similar way, a RAM address of a fn is passed to the microkernel, and that uses some assembly to call back to the 2nd app.
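For the callback direction, a hedged sketch (Zig 0.14 or later assumed): the app hands the kernel a function pointer plus an opaque context pointer, and the kernel calls it later with an ordinary indirect call. `TouchCallback`, `registerTouchCallback` and `dispatchTouch` are made-up names for illustration, not part of any existing API.

```zig
// Hypothetical callback type shared by kernel and app.
// `ctx` lets the app pass its own state back to itself.
pub const TouchCallback = *const fn (x: u32, y: u32, ctx: ?*anyopaque) callconv(.c) void;

var touch_cb: ?TouchCallback = null;
var touch_ctx: ?*anyopaque = null;

// Kernel API entry the app would call once at startup.
pub fn registerTouchCallback(cb: TouchCallback, ctx: ?*anyopaque) callconv(.c) void {
    touch_cb = cb;
    touch_ctx = ctx;
}

// Kernel side: invoked when a touch event is ready for delivery.
// Does nothing if no app has registered yet.
pub fn dispatchTouch(x: u32, y: u32) void {
    if (touch_cb) |cb| cb(x, y, touch_ctx);
}
```

Calling back into app code from an interrupt handler has its own hazards, so a real kernel might queue the event and call `dispatchTouch` from a normal context instead.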

So I think I’m asking how to do ‘extern’ without the symbol known to the compiler? Is assembly the right way to go here?
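One way to get ‘extern without a symbol’ in plain Zig is `@ptrFromInt`: the app turns the agreed address into a typed pointer to the table and calls through it, and the compiler emits the indirect branch (BLR on AArch64) itself, so no hand-written assembly is needed for the call. A minimal sketch assuming Zig 0.14 or later; `ApiTable`, `api_table_addr` and the wrappers are hypothetical, and the struct must match whatever layout the kernel really uses.

```zig
// App side. Must mirror the kernel's table layout (hypothetical here).
pub const ApiTable = extern struct {
    version: usize,
    plot: *const fn (x: u32, y: u32) callconv(.c) void,
    line: *const fn (x0: u32, y0: u32, x1: u32, y1: u32) callconv(.c) void,
};

// Hypothetical fixed address agreed between kernel and apps.
pub const api_table_addr: usize = 0x4000_1000;

// Turn a raw address into a typed pointer to the table.
pub fn bind(addr: usize) *const ApiTable {
    return @ptrFromInt(addr);
}

// What an app on the device would use.
pub fn kernelApi() *const ApiTable {
    return bind(api_table_addr);
}

// App-facing wrapper: an ordinary call through a function pointer.
pub fn line(api: *const ApiTable, x0: u32, y0: u32, x1: u32, y1: u32) void {
    api.line(x0, y0, x1, y1);
}
```

Taking the address as a parameter in `bind` also makes the wrappers testable on a host, where the fixed address is not mapped.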

Calder-Ty · March 6, 2026, 1:27pm

It sounds like you’re asking how to set up syscalls in your kernel for user space applications to call.

Your idea that each application writes its own microkernel is cool, but not the normal way, for the reasons you explain: how do you manage shared resources, and how do you schedule which microkernel has the CPU and for how long?

I’d suggest looking at how Zig makes syscalls, which does include inline assembly. I’d also look at how Unix does fork/exec. “Operating Systems: Three Easy Pieces” has a chapter on it. The book is free online, but is well worth the purchase.

floooh · March 6, 2026, 2:12pm

Why not simply copy AmigaOS? It was very simple and elegant.

E.g. the OS exposed its services as position-independent dynamic libraries, and the DLL loading system was much simpler than Linux (more like Win32), e.g. a DLL simply exposed one (or several) raw jump tables, and all the association to actual function names was done in header files.

There also was no automatic dynamic linking. An AmigaOS program had to load DLLs explicitly via the Exec calls OpenLibrary() and close them via CloseLibrary() on shutdown, then obtain an interface from the library (hmm, I don’t seem to remember that part… might have been added in a later AmigaOS version?), and this interface pointer is actually a jump table with the public library functions mapped to a C struct, so you would call library functions like this:

interface->Func(a, b, c);

Very simple and elegant.

PS: looks like this interface stuff was added in AmigaOS 4 (after my Amiga time): Libraries and Devices - AmigaOS Documentation Wiki

mnemnion · March 6, 2026, 4:49pm

I wasn’t thinking in terms of portability, just that ELF already has all the things an object format needs to have, it’s extensively documented, it’s compatible with DWARF (you’re going to want to debug!), and there’s no obvious reason to do anything differently from how ELF is already doing it.

The other two replies are also full of good advice. Just wanted to clarify about portability, ELF is used by several operating systems but is not especially portable between them. Although Justine manages, somehow…

pzittlau · March 6, 2026, 7:32pm

Using ELF as an object format is fine for nearly everything, as it’s quite simple and actually very extensible, while still being easy to work with. But, at least for me, DWARF just frustrates me.

I’m kind of hoping that sometime in the future we get a widely supported format that isn’t designed for machines from >30 years ago. For their time the design choices made sense, but today we can just map a giant flat table into memory to index into, instead of a complex tree of variable-sized entries and a Turing-complete stack machine.

chung-leong · March 6, 2026, 10:25pm

The simplest way to do this is to pass a function pointer to the app’s entry point, which would fill out a different v-table for each id it receives. Something like this:

var os: APIs(struct {
    x: FnGroupX,
    y: FnGroupY,
}) = .{};

export fn appMain(fn_addr: usize) void {
    importAPIs(&os, fn_addr);
    // ...
}
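To make the idea concrete, here is one possible way the pieces could be filled in; it is a guess at the shape, not the poster’s actual implementation. It assumes Zig 0.14 or later and a hypothetical resolver signature (`ResolveFn`) that maps a group id and function index to an address, returning 0 when unknown. The group types, their ids, and `beep` are invented for the example.

```zig
const std = @import("std");

// Hypothetical resolver the OS hands to the app: returns the address
// of function `index` in group `group`, or 0 if it is unknown.
pub const ResolveFn = *const fn (group: u32, index: u32) callconv(.c) usize;

pub const FnGroupX = extern struct {
    pub const id: u32 = 1;
    plot: *const fn (x: u32, y: u32) callconv(.c) void,
};

pub const FnGroupY = extern struct {
    pub const id: u32 = 2;
    beep: *const fn (hz: u32) callconv(.c) void,
};

pub const Apis = struct { x: FnGroupX, y: FnGroupY };

// Walk every group and every function field at compile time and ask
// the resolver for each address at run time.
pub fn importAPIs(apis: *Apis, fn_addr: usize) bool {
    const resolve: ResolveFn = @ptrFromInt(fn_addr);
    inline for (std.meta.fields(Apis)) |group_field| {
        const Group = group_field.type;
        const group = &@field(apis, group_field.name);
        inline for (std.meta.fields(Group), 0..) |fn_field, index| {
            const addr = resolve(Group.id, @intCast(index));
            if (addr == 0) return false;
            @field(group, fn_field.name) = @as(fn_field.type, @ptrFromInt(addr));
        }
    }
    return true;
}

// Left undefined until importAPIs succeeds.
pub var os: Apis = undefined;

export fn appMain(fn_addr: usize) void {
    if (!importAPIs(&os, fn_addr)) return;
    os.x.plot(10, 20);
}
```

With this shape the app only needs one address from the OS, and adding a function to a group is a one-line change on the app side as long as the resolver agrees on the index.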

The way I see it, if you’re creating an OS written in Zig, then go balls out making it as Ziggish as possible. No point in reusing infrastructure that was designed around the convention of a C API.

hvbargen · March 7, 2026, 7:15pm

OT here, but very interesting. Did anyone try something like this with Zig?

mnemnion · March 8, 2026, 10:34pm

Someone did indeed!

whitehexagon · March 17, 2026, 12:37pm

Thank-you to everyone for the ideas and input.

So I realised that I can probably simplify some of this by making use of the other 3 unused cores. I support max two Apps open (split screen), so they can have a core each. Which should just be a case of an extra stack pointer and setting the core’s PC entry address…

So last week I rewrote my touch-screen handling (it was polling) to make use of IRQ and GIC routing. So I can now route touch events to both cores and filter by Y coordinate.
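The per-core filtering could look something like the sketch below. It assumes each App owns one half of a 1440-row portrait screen and wants coordinates relative to its own half; `Slot`, `TouchEvent` and `filterTouch` are hypothetical names, and the screen height is an assumption to adjust.

```zig
// Assumed panel height in rows; adjust to the real display.
pub const screen_height: u32 = 1440;

pub const Slot = enum { top, bottom };
pub const TouchEvent = struct { x: u32, y: u32 };

// Each core runs this on every touch event it receives and keeps only
// the events that land in its own half, translated to local coordinates.
pub fn filterTouch(slot: Slot, ev: TouchEvent) ?TouchEvent {
    const half = screen_height / 2;
    return switch (slot) {
        .top => if (ev.y < half) ev else null,
        .bottom => if (ev.y >= half and ev.y < screen_height)
            TouchEvent{ .x = ev.x, .y = ev.y - half }
        else
            null,
    };
}
```

Because both cores see every event and translate it independently, a drag that crosses the split simply stops being accepted by one App and starts being accepted by the other.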

I have to say that it was really pleasant to implement this work in Zig. I have an IRQ-driven state machine for the TWI (similar to I2C). The touch-screen API is now a 2nd state machine, also driven by IRQ and message passing over the TWI interface. The code is cleaner, and so much smoother than before. I can even have the ‘Paint’ App open top & bottom, and seamlessly draw from one App over to the next.
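As a rough illustration of what an IRQ-driven state machine for a bus read can look like in Zig, here is a generic sketch. It does not model the real TWI controller: the states, the `Event` values and the `Transfer` type are all invented, and a real handler would derive the event from the controller’s status register.

```zig
const std = @import("std");

// Invented states and events for a simple "address, then read N bytes" transfer.
pub const State = enum { idle, start_sent, addr_sent, reading, done, failed };
pub const Event = enum { start_done, addr_ack, data_ready, nack };

pub const Transfer = struct {
    state: State = .idle,
    buf: [8]u8 = [_]u8{0} ** 8,
    len: usize = 0,
    want: usize = 0,

    // Start a read of `want` bytes; hardware kick-off is omitted here.
    pub fn begin(self: *Transfer, want: usize) void {
        std.debug.assert(want > 0 and want <= self.buf.len);
        self.* = .{ .state = .start_sent, .want = want };
    }

    // Called once per interrupt; advances exactly one step and returns.
    pub fn onIrq(self: *Transfer, ev: Event, data: u8) void {
        self.state = switch (self.state) {
            .start_sent => if (ev == .start_done) State.addr_sent else State.failed,
            .addr_sent => if (ev == .addr_ack) State.reading else State.failed,
            .reading => blk: {
                if (ev != .data_ready) break :blk State.failed;
                self.buf[self.len] = data;
                self.len += 1;
                break :blk if (self.len == self.want) State.done else State.reading;
            },
            // Terminal or inactive states ignore further interrupts.
            .idle, .done, .failed => self.state,
        };
    }
};
```

Since the switch must be exhaustive, adding a new state forces every transition to be reconsidered at compile time, which is part of why this style tends to stay clean.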

So I’ll end up with another mixture of Zig and global assembly, but hopefully better concurrency than the time-sliced single-core approach. Time will tell.

The screen sharing is simpler, just a split framebuffer. And I still have to complete audio, but hopefully I can just split the channels one per app as well. So maybe, this is the way.
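A split framebuffer can be modelled as two slices over the same pixel memory, each with its own bounds check, so one App cannot draw into the other’s half by accident. This is only a sketch with invented names (`View`, `split`) and an assumed 32-bit pixel format in row-major order.

```zig
// One App's window onto the shared framebuffer.
pub const View = struct {
    pixels: []u32, // rows belonging to this App only
    width: usize,
    height: usize,

    // Coordinates are local to the view; out-of-range writes are dropped.
    pub fn plot(self: View, x: usize, y: usize, color: u32) void {
        if (x >= self.width or y >= self.height) return;
        self.pixels[y * self.width + x] = color;
    }
};

// Split a w-by-h framebuffer into a top and a bottom view.
// With an odd height, the last row is left unassigned.
pub fn split(fb: []u32, w: usize, h: usize) [2]View {
    const half_rows = h / 2;
    const half = half_rows * w;
    return .{
        .{ .pixels = fb[0..half], .width = w, .height = half_rows },
        .{ .pixels = fb[half .. half * 2], .width = w, .height = half_rows },
    };
}
```

The same local-coordinate convention would pair naturally with touch events translated per half.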

Although multiple cores might be wasteful in terms of power/battery, and I am not currently CPU bound. The only bottleneck still seems to be drawing to the screen, even with the MMU configured. But I only have one full-screen App so far that struggles a bit: a WIP Pac-Man that I last implemented in 68000 on the Amiga. Just to complete the circle.