Hi everyone,
I’m trying to create a new language and was hoping to understand some things about Zig’s compiler internals.
Specifically I’d like to know: How does Zig compile generics so quickly?
In most other modern languages like C++, Rust, Swift, etc. generics are known to slow down compilation times, and yet in Zig they don’t seem to have as much of an adverse effect.
One of the major reasons could be that Zig uses it’s own custom x86 ASM backend, instead of LLVM, so it’s expected to be faster than other languages which depend solely on LLVM.
However that’s not the answer that I’m trying to find. The greatest cost of generics comes from monomorphisation where new artifacts are created by instantiating the generics. And by definition, that is slow, because this causes lot of code duplication, and bloat.
Languages such as C++/Rust which often decide to monomorphise a lot, suffer from slow compilations. In C++, it’s significantly slower, because templates need to be instantiated in each compilation unit (source file). Rust is slightly better because it compiles at a crate scope. So the Rust compiler can optimize and reduce the number of instantiations required.
I would request to know what such similar optimization technique Zig compiler uses to compile generics quickly. Specifically because Zig does lot more complex things than other languages, (such as comptime
), where code can be run during compilation, and is expected to be slow.
However it isn’t slow. I haven’t measured in detail, but somehow by my annecdotal observation alone, Zig compiles fast.
The only optimizations that I think Zig might be doing, are,
- Reducing the number of instantiations by “caching” previous instantiations. But that is limited by RAM size, and for extremely large projects (similar to Google Chrome browser) it might be difficult for the compiler to search and find every instantiation.
- Reducing the number of searches for previous instantiations, by “caching” the results of these searches themselves, and using them to quickly create new objects.
- Using a deterministic name-mangling algorithm where every instantiation of any generic (say
ArrayList(f64)
) would create a deterministic mangled name like,ArrayList_fhUehyG_blah_blah_f64
so the compiler could assume that such an instantiaion previously was created, or will be created in future, and just directly links to it, instead of wasting time compiling it again.
Is there any other older thread or resources where I can find a more definitive answers?
Thank you