Zjvm: JVM in Zig

I’ve written a toy JVM in Zig, mainly for my own learning but also as an opportunity to use Zig in a larger project. It has enough features to be useful, but there are still a few left to implement.

17 Likes

That’s cool!

I first thought all your object pointers were represented as *anyopaque, but then I realized that class instances are actually IDs sent through @ptrFromInt, and basically only the mock string builder stuff uses real pointers?

When I made a Lisp VM in Zig, I found it pleasant to completely avoid pointer types in my value representation, using only indices into a few different array lists. That way, for example, the whole heap can be serialized portably with almost no code, and the actual memory can be reallocated or copied or whatever without invalidation or aliasing.

My “stack values” were actually 32-bit tagged words. They could be 31-bit immediate numbers, or they could be references, which are typed indices, where a 5-bit type tag allows 32 different pools/lists/tables—one for cons pairs, one for symbols, one for byte vectors, and so on.
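A hypothetical sketch of such an encoding (the names `Pool`, `Word`, and the exact bit layout are my assumptions, not the actual VM’s code): the low bit distinguishes immediates from references, and references pack a 5-bit pool tag plus a 26-bit index.

```zig
const std = @import("std");

// Up to 32 pools, one per value kind (sketch — names are made up).
const Pool = enum(u5) { pair, symbol, bytes };

const Word = struct {
    bits: u32,

    // Low bit 0: the remaining 31 bits are an immediate signed number.
    fn immediate(n: i31) Word {
        return .{ .bits = @as(u32, @as(u31, @bitCast(n))) << 1 };
    }

    // Low bit 1: then 5 bits of pool tag, then 26 bits of index.
    fn reference(pool: Pool, index: u26) Word {
        return .{ .bits = 1 |
            (@as(u32, @intFromEnum(pool)) << 1) |
            (@as(u32, index) << 6) };
    }

    fn isImmediate(self: Word) bool {
        return self.bits & 1 == 0;
    }

    fn asNumber(self: Word) i31 {
        return @bitCast(@as(u31, @intCast(self.bits >> 1)));
    }

    fn poolOf(self: Word) Pool {
        return @enumFromInt(@as(u5, @truncate(self.bits >> 1)));
    }

    fn indexOf(self: Word) u26 {
        return @truncate(self.bits >> 6);
    }
};
```

The nice property is that a `Word` is always a plain integer: it can be stored, copied, or written to disk without caring where the pools currently live in memory.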

The garbage collector is a simple stop-and-copy. It walks through the roots, moving everything to new array lists and leaving forwarding markers behind; when it’s done, all the array lists are compact and related values tend to end up adjacent. It took some effort and reading to get this right, but it was very interesting!
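A sketch of the copying step for one pool, with hypothetical names (this is my illustration of the technique, not the VM’s actual code): the first time a live object is reached from a root it is appended to the new list, and its old slot is overwritten with a forwarding index so every later reference resolves to the same new location.

```zig
const std = @import("std");

const Pair = struct { car: u32, cdr: u32 };

// An old-pool slot is either still live, or has already been moved
// and now only records where it went.
const Slot = union(enum) {
    live: Pair,
    moved: u32, // forwarding index into the new pool
};

// Minimal fixed-capacity destination pool, to keep the sketch allocator-free.
const NewPool = struct {
    items: []Pair,
    len: u32 = 0,
};

fn copyPair(old: []Slot, new: *NewPool, index: u32) u32 {
    switch (old[index]) {
        .moved => |to| return to, // already relocated: follow the marker
        .live => |pair| {
            const to = new.len;
            new.items[to] = pair;
            new.len += 1;
            old[index] = .{ .moved = to }; // leave the forwarding marker behind
            return to;
        },
    }
}
```

Because values are indices rather than pointers, patching a reference after the copy is just storing the returned index — nothing is invalidated when the backing memory moves.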

3 Likes

Interesting. I did this >30 years ago. Recently someone said in a different thread that allocators inevitably leak memory due to small gaps between objects, and I felt tempted to contradict them: by using handles instead of pointers, moving data around in memory is no problem, e.g. in a compacting GC. Of course, this comes at the cost of an additional indirection…

1 Like

Hmm, now I’m thinking about a compacting garbage collection pass as a variant of a pattern I’ve implemented in Zig several times: using an arena allocator while constructing some complex structure, producing a suboptimal object graph with waste and fragmentation, then afterwards moving the desired subgraph into a new space where siblings are memory-adjacent, and finally freeing all the garbage… Like, I’m working on a PEG parsing library, and it’s exactly that pattern: the messy backtracking process builds a fragmented parse tree, and once parsing is final, everything gets tidied up into a nicely contiguous syntax tree…
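A sketch of that pattern with made-up names (a linked chain standing in for the real parse tree): build the fragmented version in an arena, copy just the nodes worth keeping into one contiguous slice owned by a longer-lived allocator, and let `arena.deinit()` throw away all the construction garbage at once.

```zig
const std = @import("std");

const Node = struct { value: u32, next: ?*Node };

// Copy the chain starting at `first` into one contiguous allocation.
fn compact(dest: std.mem.Allocator, first: *const Node) ![]Node {
    var count: usize = 0;
    var it: ?*const Node = first;
    while (it) |n| : (it = n.next) count += 1;

    const out = try dest.alloc(Node, count);
    it = first;
    var i: usize = 0;
    while (it) |n| : (it = n.next) {
        out[i] = .{
            .value = n.value,
            // siblings become memory-adjacent: links point into the slice
            .next = if (i + 1 < count) &out[i + 1] else null,
        };
        i += 1;
    }
    return out;
}
```

After `compact` returns, the arena and everything fragmented inside it can be freed in O(1), and the survivors are cache-friendly.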

1 Like


This isn’t a leak: this is memory fragmentation. A leak is an allocation that is never freed.

Fragmentation and leaks, separately or together, can lead to out-of-memory conditions even when not all of memory is actively in use.

1 Like

FYI GitHub - plasma-umass/Mesh: A memory allocator that automatically reduces the memory footprint of C/C++ applications.

Be sure to watch the video, very interesting https://www.youtube.com/watch?v=c1UBJbfR-H0

Welcome to Ziggit @lyledean1!

This is a nice first project. Many people think the JVM is more esoteric than it is, because of the immense complexity of “production” implementations, but far from it: it’s a fairly straightforward and comprehensible virtual machine. Which is not to say that implementing it is an easy matter.

A tip: you’ve implemented a classic VM here, as a while loop which reads from the program counter and has a master dispatch switch on the opcode. When the PC falls off the end of the bytecode, you’re done.
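That classic shape looks roughly like this (a toy sketch with two made-up opcodes, not the actual Zjvm code):

```zig
const std = @import("std");

const Op = enum(u8) { push, add };

fn run(code: []const u8) i64 {
    var stack: [16]i64 = undefined;
    var sp: usize = 0;
    var pc: usize = 0;
    while (pc < code.len) { // bounds check on every single iteration
        switch (@as(Op, @enumFromInt(code[pc]))) { // the one master dispatch
            .push => {
                stack[sp] = code[pc + 1];
                sp += 1;
                pc += 2;
            },
            .add => {
                sp -= 1;
                stack[sp - 1] += stack[sp];
                pc += 1;
            },
        }
    }
    return stack[sp - 1];
}
```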

The disadvantage of this approach is that master dispatch switch: it compiles down to a single indirect branch instruction. This inflicts all of the Turing-complete complexity of the executing program on the branch predictor at that one instruction, which saturates it, yet real programs are much more predictable than that.

I encourage you to look into labeled switch continue, which will let you spread that out. Instead of incrementing the program counter at the base of the while loop, you’ll want to do a labeled continue to the next opcode inside each switch branch.

That way, every bytecode instruction gets its own CPU instruction on which to do the next dispatch, so the branch predictor can make use of any regularities in what tends to follow what, of which there are many. This reliably makes VMs somewhat faster.

While you’re at it, you can get rid of the while loop entirely. Add a phantom JVM opcode stop, and make the bytecode array a sentinel-terminated one: [:Bytecode.stop]Bytecode. Then you can break from that, and eliminate an entire length check from every single operation. I believe the ‘done thing’ here is to use one of the two reserved implementation-defined bytecodes, but there are several more available (as you clearly know).
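A sketch combining both suggestions, using the same toy opcodes rather than real JVM bytecodes (requires Zig 0.14+ for labeled switch continue): a phantom `stop` opcode doubles as the array sentinel, each prong does its own labeled continue, and there is no while loop and no per-iteration bounds check.

```zig
const std = @import("std");

const Op = enum(u8) { push, add, stop };
const stop_byte: u8 = @intFromEnum(Op.stop);

fn run(code: [:stop_byte]const u8) i64 {
    var stack: [16]i64 = undefined;
    var sp: usize = 0;
    var pc: usize = 0;
    dispatch: switch (@as(Op, @enumFromInt(code[pc]))) {
        .push => {
            stack[sp] = code[pc + 1];
            sp += 1;
            pc += 2;
            continue :dispatch @enumFromInt(code[pc]); // own dispatch site
        },
        .add => {
            sp -= 1;
            stack[sp - 1] += stack[sp];
            pc += 1;
            continue :dispatch @enumFromInt(code[pc]); // own dispatch site
        },
        // reading the sentinel lands here and ends execution
        .stop => {},
    }
    return stack[sp - 1];
}
```

Indexing a sentinel-terminated slice at `len` is legal and yields the sentinel, which is exactly what makes the length check unnecessary.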

Fun fact: the JVM is designed, like WASM, so that static analysis can confirm that every jump in a program stays within the program boundary. I don’t know whether your loader does this yet, but it’s possible by design.

4 Likes