The Zig compiler seems much slower than other LLVM-based compilers like rustc or clang.
The benchmark below compiles a minimal binary in debug mode.
For Zig, almost all of the time is spent in the LLVM Emit Object phase.
How does Zig use LLVM differently so that it’s 8x slower than Clang and 4.5x slower than rustc?
echo 'pub fn main() void {}' > main.zig
echo 'fn main() {}' > main.rs
echo 'int main() { return 0; }' > main.cc
hyperfine --shell=none --export-markdown out.md \
'zig build-exe main.zig' \
'rustc main.rs' \
'clang++ main.cc -o main'
zig version # => 0.13.0
rustc --version # => rustc 1.78.0
clang++ --version # => clang version 17.0.6
Results

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| `clang++ main.cc -o main` | 0.157 ± 0.008 | 0.142 | 0.171 | 1.00 |
| `rustc main.rs` | 0.282 ± 0.041 | 0.243 | 0.381 | 1.79 ± 0.28 |
| `zig build-exe main.zig` | 1.269 ± 0.080 | 1.127 | 1.360 | 8.06 ± 0.66 |
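For reference, the "8x" and "4.5x" figures in the question can be recomputed from the mean wall times in the results. This is only a small sketch: `ratio` is a hypothetical helper, not part of hyperfine, and the inputs are the rounded means from the table.

```shell
# ratio A B: print A / B rounded to two decimals.
# Hypothetical helper for illustration; not part of hyperfine.
ratio() {
  awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

# Mean wall times in seconds, copied from the results table.
clang_mean=0.157
rustc_mean=0.282
zig_mean=1.269

echo "zig vs clang++: $(ratio "$zig_mean" "$clang_mean")x"
echo "zig vs rustc:   $(ratio "$zig_mean" "$rustc_mean")x"
```

The clang++ ratio computed this way can differ slightly from the Relative column, presumably because hyperfine works from unrounded means.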
Are you sure what you’re building is comparable? You may be building glibc in the zig build, whereas you are linking it for the two other builds.
Assuming you’re on Linux, if you look at the LLVM module that is being compiled it will be evident. Rust and Clang emit only the main function and then rely on a precompiled libc for everything, whereas Zig is producing a static executable that has compiled all the parts of the standard library that are depended on. In particular, debug builds almost always depend on the ability to print a stack trace, which ends up being a fair amount of code.
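One way to look at the LLVM modules yourself is to have each compiler write textual LLVM IR and compare the sizes. The following is a rough sketch, assuming `clang`, `rustc`, and an LLVM-enabled `zig` are on your `PATH`. The output file names and the `have` helper are made up for illustration, and any missing compiler is simply skipped.

```shell
# Recreate the minimal sources (main.c is a C variant of the original main.cc).
echo 'pub fn main() void {}' > main.zig
echo 'fn main() {}' > main.rs
echo 'int main() { return 0; }' > main.c

# have TOOL: true if TOOL is on PATH (hypothetical helper).
have() { command -v "$1" >/dev/null 2>&1; }

# Ask each compiler for textual LLVM IR; the .ll file names are arbitrary.
if have clang; then
  clang -S -emit-llvm main.c -o clang.ll || echo "clang failed" >&2
fi
if have rustc; then
  rustc --emit=llvm-ir main.rs -o rustc.ll || echo "rustc failed" >&2
fi
if have zig; then
  zig build-exe main.zig -femit-llvm-ir=zig.ll || echo "zig failed" >&2
fi

# Line counts give a crude measure of how much code LLVM has to process.
for f in clang.ll rustc.ll zig.ll; do
  if [ -f "$f" ]; then wc -l "$f"; fi
done
```

If the explanation above holds, `zig.ll` should be far larger than the other two, since it contains the parts of the standard library that the debug build pulls in.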
You can get the three compilers to do a similar amount of work by having them emit object files instead:
Benchmark 1 (242 runs): clang -c main.c
measurement mean ± σ min … max outliers delta
wall_time 20.6ms ± 1.51ms 18.8ms … 25.6ms 6 ( 2%) 0%
peak_rss 96.6MB ± 193KB 95.7MB … 96.7MB 5 ( 2%) 0%
cpu_cycles 50.4M ± 1.92M 46.4M … 56.3M 2 ( 1%) 0%
instructions 76.2M ± 32.5K 76.1M … 76.3M 8 ( 3%) 0%
cache_references 2.45M ± 18.3K 2.42M … 2.52M 9 ( 4%) 0%
cache_misses 425K ± 3.03K 408K … 441K 9 ( 4%) 0%
branch_misses 347K ± 2.65K 342K … 359K 3 ( 1%) 0%
Benchmark 2 (217 runs): rustc --emit=obj main.rs
measurement mean ± σ min … max outliers delta
wall_time 23.0ms ± 1.09ms 21.1ms … 25.1ms 0 ( 0%) 💩+ 11.9% ± 1.2%
peak_rss 120MB ± 298KB 119MB … 121MB 0 ( 0%) 💩+ 24.1% ± 0.0%
cpu_cycles 52.3M ± 3.26M 45.2M … 58.9M 0 ( 0%) 💩+ 3.8% ± 1.0%
instructions 66.9M ± 20.3K 66.9M … 67.0M 10 ( 5%) ⚡- 12.1% ± 0.0%
cache_references 2.94M ± 14.1K 2.91M … 3.00M 4 ( 2%) 💩+ 20.1% ± 0.1%
cache_misses 694K ± 6.06K 681K … 718K 4 ( 2%) 💩+ 63.6% ± 0.2%
branch_misses 380K ± 1.97K 374K … 388K 6 ( 3%) 💩+ 9.4% ± 0.1%
Benchmark 3 (135 runs): zig build-obj main.zig
measurement mean ± σ min … max outliers delta
wall_time 37.2ms ± 4.58ms 27.3ms … 48.1ms 0 ( 0%) 💩+ 80.7% ± 3.1%
peak_rss 92.5MB ± 839KB 90.2MB … 94.1MB 0 ( 0%) ⚡- 4.2% ± 0.1%
cpu_cycles 41.1M ± 1.32M 38.0M … 44.3M 0 ( 0%) ⚡- 18.4% ± 0.7%
instructions 50.9M ± 7.44K 50.9M … 50.9M 0 ( 0%) ⚡- 33.1% ± 0.0%
cache_references 2.62M ± 28.7K 2.56M … 2.72M 1 ( 1%) 💩+ 7.1% ± 0.2%
cache_misses 491K ± 16.4K 457K … 538K 0 ( 0%) 💩+ 15.7% ± 0.5%
branch_misses 275K ± 2.47K 270K … 284K 2 ( 1%) ⚡- 20.8% ± 0.2%
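To reproduce a comparison like this, something along the following lines should work. It is a sketch rather than the exact invocation used: the output format above looks like that of the `poop` benchmarking tool, so the script assumes `poop` and falls back to `hyperfine` if it is not installed. Compilers that are not on your `PATH` are left out.

```shell
# Recreate the minimal sources.
echo 'pub fn main() void {}' > main.zig
echo 'fn main() {}' > main.rs
echo 'int main() { return 0; }' > main.c

# have TOOL: true if TOOL is on PATH (hypothetical helper).
have() { command -v "$1" >/dev/null 2>&1; }

# Collect the object-file commands for whichever compilers are installed.
set --
if have clang; then set -- "$@" 'clang -c main.c'; fi
if have rustc; then set -- "$@" 'rustc --emit=obj main.rs'; fi
if have zig;   then set -- "$@" 'zig build-obj main.zig'; fi

# Assumption: poop produced the output above; hyperfine is a fallback.
if [ "$#" -gt 0 ] && have poop; then
  poop "$@"
elif [ "$#" -gt 0 ] && have hyperfine; then
  hyperfine --shell=none "$@"
else
  echo "no compilers or benchmark tool found; nothing to run"
fi
```

Emitting object files skips the link step for all three compilers, so none of them is charged for building or linking a runtime.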
Percentage-wise it still does not look good, but at this point we’re looking at differences of tens of milliseconds, so it starts to get into constant-time overhead territory.
Anyway, the main focus of the compiler development team right now is addressing this by introducing incremental compilation, which means the compiler will no longer redo all the work of building the standard library on successive compilations. With this in place, it will become clear that the Zig compiler is indeed quite fast.