LEGv8 emulator

Hello everyone, here’s a probably too-long post about lemu, a LEGv8 emulator.

Background

LEGv8 is an ARMv8-like assembly language used for educational purposes. It is described in Computer Organization and Design, ARM Edition by Patterson and Hennessy.

The book does not come with a LEGv8 emulator, so my professor wrote one in C named legv8emul. However, this emulator is closed-source and largely a black box, with printf debugging as the only way to see what’s actually going on.

What does it do?

lemu is an open-source LEGv8 emulator that provides more features to develop and inspect your programs:

Oracle testing

One of my favorite parts of this project was building the oracle fuzzer. It generates random programs that print the contents of all used registers, then runs each program under both lemu and legv8emul (my professor’s emulator) and checks that the two produce the same output. I used this to find many bugs in both my emulator and my professor’s.
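To make the idea concrete, here is a minimal sketch of differential (oracle) fuzzing in Python. It is not lemu's actual fuzzer: the instruction subset, the `PRNT` print instruction, the `{file}` placeholder, and the helper names `random_program`, `make_runner`, and `find_mismatch` are all assumptions made for illustration. A real harness would probably also need to normalize each emulator's output before comparing.

```python
import os
import random
import subprocess
import tempfile

# Assumed instruction subset. PRNT (print one register) is treated here as an
# emulator-specific extension; swap in whatever your emulators actually accept.
ARITH_OPS = ["ADD", "SUB", "AND", "ORR", "EOR"]


def random_program(rng, length=8, num_regs=4):
    """Build a random straight-line program that prints every register it uses."""
    # Give every register a known starting value so the output never depends
    # on how an emulator initializes its register file.
    lines = [f"ADDI X{r}, XZR, #{rng.randrange(4096)}" for r in range(num_regs)]
    for _ in range(length):
        op = rng.choice(ARITH_OPS)
        rd, rn, rm = (rng.randrange(num_regs) for _ in range(3))
        lines.append(f"{op} X{rd}, X{rn}, X{rm}")
    lines += [f"PRNT X{r}" for r in range(num_regs)]
    return "\n".join(lines) + "\n"


def make_runner(*argv_template):
    """Return a function that runs an emulator on source text and returns stdout.

    The literal string "{file}" in argv_template is replaced by the path of a
    temporary file holding the program (a convention invented for this sketch).
    """
    def run(source):
        with tempfile.TemporaryDirectory() as tmp:
            path = os.path.join(tmp, "case.lv8")
            with open(path, "w") as f:
                f.write(source)
            argv = [path if arg == "{file}" else arg for arg in argv_template]
            done = subprocess.run(argv, capture_output=True, text=True, timeout=10)
            return done.stdout
    return run


def find_mismatch(run_a, run_b, seed_count=100):
    """Return (seed, program) for the first program whose outputs differ, else None."""
    for seed in range(seed_count):
        program = random_program(random.Random(seed))
        if run_a(program) != run_b(program):
            return seed, program
    return None

# Example wiring, using the command lines from the benchmark in this post:
#   lemu = make_runner("lemu", "{file}")
#   oracle = make_runner("./legv8emul", "{file}", "-s", "2000")
#   print(find_mismatch(lemu, oracle))
```

Seeding each program from its own integer means any mismatch can be reproduced from the seed alone, which is handy when reporting a bug in either emulator.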

Pointless benchmarking

When I first started doing benchmarks, I was disappointed to learn that lemu was consistently 40% or more slower than legv8emul. However, after profiling with poop and callgrind, two diffs (both under 30 lines changed) helped lemu meet and sometimes exceed the performance of legv8emul.

Here’s a benchmark, measured with poop, that recursively calculates fib(x) for x from 1 through 30.
$ uname -a
Linux archlinux 6.17.2-arch1-1 #1 SMP PREEMPT_DYNAMIC Sun, 12 Oct 2025 12:45:18 +0000 x86_64 GNU/Linux
$ zig build -Doptimize=ReleaseFast -Dstrip
$ poop "lemu test/behavior/fib.lv8" "./legv8emul test/behavior/fib.lv8 -s 2000"
Benchmark 1 (32 runs): lemu test/behavior/fib.lv8
  measurement          mean ± σ            min … max           outliers         delta
  wall_time           157ms ± 9.50ms     148ms …  192ms          2 ( 6%)        0%
  peak_rss           1.73MB ± 39.3KB    1.55MB … 1.74MB          3 ( 9%)        0%
  cpu_cycles          654M  ± 28.3M      552M  …  680M           2 ( 6%)        0%
  instructions       3.27G  ±  152M     2.67G  … 3.39G           2 ( 6%)        0%
  cache_references   2.00K  ± 1.10K     1.00K  … 5.92K           2 ( 6%)        0%
  cache_misses       1.40K  ±  821       709   … 4.50K           2 ( 6%)        0%
  branch_misses      1.02M  ± 47.2K      832K  … 1.07M           2 ( 6%)        0%
Benchmark 2 (20 runs): ./legv8emul test/behavior/fib.lv8 -s 2000
  measurement          mean ± σ            min … max           outliers         delta
  wall_time           262ms ± 6.99ms     253ms …  279ms          1 ( 5%)        💩+ 67.3% ±  3.2%
  peak_rss           1.55MB ±    0      1.55MB … 1.55MB          0 ( 0%)        ⚡- 10.3% ±  1.0%
  cpu_cycles         1.17G  ± 31.2M     1.08G  … 1.19G           3 (15%)        💩+ 78.3% ±  2.6%
  instructions       4.50G  ±  113M     4.17G  … 4.55G           4 (20%)        💩+ 37.7% ±  2.4%
  cache_references   2.95K  ±  896      1.71K  … 4.63K           0 ( 0%)        💩+ 47.7% ± 29.5%
  cache_misses       2.14K  ±  748      1.33K  … 3.80K           0 ( 0%)        💩+ 52.7% ± 32.5%
  branch_misses      1.35M  ± 34.2K     1.25M  … 1.37M           4 (20%)        💩+ 33.0% ±  2.4%
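For anyone unfamiliar with poop's output: as far as I understand it, the delta column compares each later command against the first one, so the positive deltas above mean legv8emul used more time, cycles, and instructions than lemu. A quick Python sketch that recomputes a few deltas from the rounded means in the table (the helper name `relative_delta` is made up for this example):

```python
def relative_delta(baseline, other):
    """Percent change of `other` relative to `baseline`."""
    return (other - baseline) / baseline * 100.0


# Rounded means copied from the table above: (lemu, legv8emul).
means = {
    "wall_time (ms)": (157, 262),
    "cpu_cycles": (654e6, 1.17e9),
    "instructions": (3.27e9, 4.50e9),
}

for name, (lemu, legv8emul) in means.items():
    print(f"{name}: {relative_delta(lemu, legv8emul):+.1f}%")
```

The results land within about a percentage point of what poop prints. The small differences presumably come from poop working with unrounded measurements.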

Final notes

The Zig build system is awesome and extremely useful for building tests, generating files, and managing dependencies. I used lsp-kit for the LSP types and server, and zigline for the debugger REPL. All of the emulator’s features are also exposed via a Zig module with these optional dependencies.

If you have any suggestions for improving lemu, then feel free to leave a comment here or on the repository.

Thanks.
