Llama2.zig: Inference Llama2 in one file of pure Zig and a reflection on my first Zig project

Link to project: GitHub - cgbur/llama2.zig: Inference Llama 2 in one file of pure Zig

I’m super proud of my first project in Zig. It is based on llama2.c, the code Andrej Karpathy wrote recently that runs Llama 2 inference in a single file of C. I’ve tried to make the Zig as idiomatic as possible, but I am definitely still learning and would love feedback!

I’ve spent a fair amount of time optimizing the single-threaded performance by using the @Vector feature where possible and by fusing some of the matrix multiplications with comptime magic.
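
To give a feel for what @Vector code looks like, here is a minimal sketch of a vectorized dot product, which is the inner loop of a matrix-vector multiply. It is an illustration only and not the code from llama2.zig: the function name `dot`, the lane count of 8, and the Zig 0.11-or-newer syntax are all assumptions.

```zig
const std = @import("std");

// Hypothetical lane count; a real implementation would choose it per target.
const lanes = 8;
const Vec = @Vector(lanes, f32);

/// Dot product of two equal-length slices, `lanes` elements at a time.
fn dot(a: []const f32, b: []const f32) f32 {
    std.debug.assert(a.len == b.len);

    var acc: Vec = @splat(0.0);
    var i: usize = 0;
    while (i + lanes <= a.len) : (i += lanes) {
        // A fixed-size array slice coerces directly to a vector.
        const va: Vec = a[i..][0..lanes].*;
        const vb: Vec = b[i..][0..lanes].*;
        acc += va * vb; // element-wise multiply and accumulate
    }

    // Horizontal sum of the accumulator lanes.
    var sum: f32 = @reduce(.Add, acc);

    // Scalar tail for lengths that are not a multiple of `lanes`.
    while (i < a.len) : (i += 1) {
        sum += a[i] * b[i];
    }
    return sum;
}
```

The arithmetic reads like ordinary scalar Zig, and the compiler is left to lower the vector type to whatever SIMD instructions the target offers.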

I have little proof, but I believe this is currently the fastest no-dependency, single-file, single-threaded Llama 2 inference code out there. Writing it in Zig was a breeze. A multi-threaded implementation is to come in the future.

Other thoughts about Zig

I want to list some thoughts about Zig as a first-time user and someone who has mostly written Rust for the past 5 years.

I first really got the itch to try this language 2-3 months ago while reading Alex Kladov’s excellent blog. A lot of the things he said in favor of Zig resonated with me, and then I went down a crazy rabbit hole learning about Data Oriented Design and watching CppCon and Andrew Kelley talks. Learned a lot.

These are in no particular order.

  • I have few problems with Rust and continue to use it for my day job without issue. I am by far more productive in Rust than any other language I have tried. I like Zig for the use case of writing perfect, performant code. I hope to use it for more projects in the future. My day job is data compression and I think Zig would be a great fit for that.

  • Zig’s simplicity is awesome. Rust is complex and I don’t like that. Zig’s source code is really easy to read and understand. It has reawakened my urge, and removed my hesitation, to dive into the implementation of the std library code I use. I think this was driven somewhat by a lack of documentation. I often got stuck, could not find a way of doing things, and relied on test cases or other pieces of the std library to figure it out. I am spoiled by Rust’s documentation and think Zig will get there over time.

  • Comptime is simply amazing. I spent years writing Rust before ever writing a macro. And to this day I don’t like doing it. With little Zig experience I made a comptime function that generates fused matrix multiplication and it was easy, maintainable, and looks just like other code. Awesome awesome feature.

  • @Vector is one of the most compelling reasons for me to use Zig. I have never written proper SIMD code before: it looks arcane, the documentation assumes a lot of prior knowledge, there are many different choices for each feature set, and it is essentially assembly. Zig’s @Vector is easy to read because it looks like normal Zig code, has nice implementations for common operations, worked like a charm, and gave huge performance wins. I hope this part of the language continues to get the love and attention it deserves.

  • std/compress/zstandard/types.zig is what convinced me I needed to learn this language. This is the most elegant format definition I have ever seen. Bless you @dweiller.

  • Getting started was really hard. Just getting an allocator, reading program args, or trying to understand and modify build.zig was daunting and took a long time. I did Ziglings first, but it’s a different story when you sit down and need to write something on a blank page. More could be done to make onboarding easier. I got there eventually, but it is probably not for the faint of heart. But this is fine; I believe that instruments are made for people who can play them. Zig is a great example of this, and that is what makes it powerful.

  • Tooling isn’t a deal breaker. ZLS is pretty good, but it’s missing some features. No big problems. Goto definition and find references work great. Without goto definition for jumping into the source code of std library functions, I think I would have wasted a lot of time. I never thought I would like coding without type hints, but I did and it’s fine. Copilot for Zig kind of sucks at this time and loves to suggest things that don’t exist or to start writing Rust code.
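
As an illustration of the comptime point in the list above, here is a minimal sketch of how a fused matrix multiplication can be generated from a compile-time count. It is not the implementation from llama2.zig: the name `fusedMatmul`, the row-major layout, and the Zig 0.11-or-newer syntax are assumptions made for the example.

```zig
const std = @import("std");

/// Multiplies `n` row-major weight matrices by the same input `x` in a
/// single pass, so each element of `x` is loaded once and reused for all
/// `n` products. Every matrix is assumed to have the same shape.
fn fusedMatmul(
    comptime n: usize,
    outs: [n][]f32,
    weights: [n][]const f32,
    x: []const f32,
) void {
    const cols = x.len;
    const rows = outs[0].len;

    for (0..rows) |r| {
        var sums = [_]f32{0} ** n;
        for (x, 0..) |xv, c| {
            // `inline for` is unrolled at compile time because `n` is
            // comptime-known: one multiply-add per fused matrix.
            inline for (0..n) |m| {
                sums[m] += weights[m][r * cols + c] * xv;
            }
        }
        inline for (0..n) |m| {
            outs[m][r] = sums[m];
        }
    }
}
```

A call such as `fusedMatmul(3, outs, weights, x)` would then compute three products that share one input (for example separate query, key, and value projections) while walking `x` only once per row. There is no macro language involved; the generic version is plain Zig with one `comptime` parameter.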

Zig is awesome. Thank you to everyone who has worked on it and continues to do so. I hope to use it more in the future.


Hey! Congratulations. I’m looking forward to reading your implementation!


Great work. I have a similar story regarding Rust and Zig. I also have similar reasons for liking Zig (and I wrote them down in my blog here).
