Zig and AI coding

Let’s try to keep a Zig angle here. With TypeScript, AI can do quite a lot; with Zig, out of the box, it easily hallucinates or mixes different language versions. Surprisingly simple fixes, and the resulting positive experiences, have just been reported.

I’m currently wondering about the computational cost of pointing agentic AI at Zig documentation or standard library source code. DeepSeek 3.1 or other Chinese open models could run locally on a $10k Mac, were it not currently cheaper to use cloud services. But the agentic approach might stop making sense if cloud subscription prices go up (all the services seem to be losing money), because waiting for the model to call tools multiple times, parse the output, and spend a bunch of thinking tokens costs someone money or time.

Is this sustainable, and can we somehow do better while still using the latest Zig release? Or can we still get more bang for the buck from the agentic approach?

1 Like

I’ve had a good experience with Claude and Zig. I tend to operate more in a “high-level engineering mode” now instead of constantly being in the weeds with syntax. I do babysit the assistant and read every change, give guidance, discuss approaches, handle branching and merging, and do the planning. There are some common mistakes Claude makes because of changes in 0.15.2. When I see mistakes that are repeated, I have Claude write a note about them in a cheatsheet, then @include it in context when needed. Overall, the Zig approach to development, especially leveraging the entire C/C++ ecosystem (which has a lot of training data), has made coding way better for me. Also, the fact that Zig is a hermetic environment, with all the source for the stdlib right where you’re using it, is great for the assistant.

At the end are a couple of examples of how Claude helped me create builds for what I would consider sizable projects. I would certainly not have known all the fiddly flags, #defines, and syntax to do this without a lot of googling, but Claude already knows all this and just spits out the right build changes by looking at the build errors. Converting code that is tied to MSVC to work with zig cc is something I’ve found especially cool.

As an experiment, try asking your assistant to analyze all of the allocations in your program, and identify opportunities to use arena allocators. This type of refactor is fairly trivial for an assistant, and has great benefits.
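A minimal sketch of what such a refactor tends to look like (the buffers here are hypothetical stand-ins for real scratch allocations): many individually freed allocations collapse into one arena that is freed in a single `deinit`.

```zig
const std = @import("std");

// Before the refactor, each scratch buffer would need its own
// `defer alloc.free(buf)`. With an arena, one deinit frees them all.
test "arena allocator frees all scratch allocations in one deinit" {
    var arena = std.heap.ArenaAllocator.init(std.testing.allocator);
    defer arena.deinit(); // one free for every allocation below

    const alloc = arena.allocator();

    var total: usize = 0;
    for (0..10) |i| {
        const buf = try alloc.alloc(u8, i + 1); // no matching free needed
        total += buf.len;
    }
    try std.testing.expectEqual(@as(usize, 55), total);
}
```

The win is not just fewer lines: the lifetime of every scratch allocation becomes obvious at a glance, which is exactly the kind of mechanical transformation an assistant handles well.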

Cheers.

Aaron

5 Likes

I don’t code professionally, so I can’t judge how useful current versions of LLMs are for large projects. But I made a living as a programmer for a decade, until 2010. Whatever I made during those 10 years could easily be made by current frontier models, judging from the few hobby projects I’ve built with them.

You can argue that they don’t write the best code. But how much of the code actually written by people in this business isn’t also slop? You can make an argument that the environmental cost is too high. You can argue it takes away the joy of coding. But dismissing them as just producing slop sounds like someone who has never used them.

2 Likes

I’m not convinced using LLMs to generate significant amounts of Zig code helps the ecosystem but the industry is going this way and I’m sure it won’t slow down with future generations…

Perhaps, if there is a push from parts of the community, it would be worth maintaining some guiding principles for those who lean more into LLMs, with the goal of shaping positive LLM-based outcomes and discouraging certain negative LLM-based behaviours (e.g., AI drive-by pull requests to projects clearly not accepting AI contributions).

A principle to start with could be around clarity, e.g., all AI-generated code is labelled as AI-generated. This could then lean into a point about respect, e.g., do not request AI-generated changes to a project that’s not labelled as AI-generated. (I also imagine something around understanding / human review / learning, i.e., ideally the LLM doesn’t take away our ability to understand what it produces.)

With all that said, I think the dollars and compute spent on an LLM to generate Zig code would be better donated to the Zig foundation, Zig tools, or some other open source contributor to the Zig ecosystem. But I understand the industry is changing, and maybe I’m now that stubborn old man yelling at clouds…

4 Likes

I use LLMs as a substitute for human consultants when I need help.

Overall, LLMs are not very reliable, but the degree of unreliability is about the same as that of other humans. Often, humans are even less reliable. Generally speaking, the most trustworthy person is yourself, followed by professional humans, then AI, and then your other colleagues.

The cost of communicating with LLMs is lower than that of humans (in terms of timeliness). But the person I trust most is always myself: I will use the content an LLM outputs, but it must be strictly reviewed by me. However, in many real-world counterexamples, the users of LLMs themselves cannot confirm exactly what the output means.

3 Likes

This is not possible. If you are not familiar with AI-assisted programming: it’s like pair programming with a fairly knowledgeable junior developer. It’s an interactive process. The line between what is your code and what is AI code is blurred. If I come up with some amazing idea, create the rough structure of the algorithm, and let Claude Code finish it, who really created it?

What you described in my mind is AI generated code.

If I design a system and another person writes the code for it then it’s written by that other person. The code is their creative work and the design is my creative work.

I see your point though; it can be blurry, and it ranges in significance. E.g., if I ask it to refactor a switch to if/else, or to split a module in two, it’ll produce what I’d be writing verbatim.

A principle could be more open-ended: “if you use an AI/LLM in any part of the process of producing open source code, then it should be appropriately disclosed (refactoring, brainstorming, research, test authorship, complete implementation, etc.)”

My main point for the clarity example (maybe not the best word choice) is that people are rightfully concerned about AI use. Concerns may be legal, ethical, or about quality. Not disclosing doesn’t help calm concerns and can break trust in communities, but high-quality projects proudly disclosing their AI use will help build it.

3 Likes

I think the problem with LLM contributions to large long running projects is that it shifts the balance of work from the people writing code, to the people reviewing and refactoring code.

It is very easy to use an LLM to generate code that is almost correct. It is very hard to read almost-correct code you didn’t write or participate in designing, and find bugs in it. If you do find a flaw during review and you point it out, you can’t make the LLM “learn” this the way you could instruct a junior developer. There’s no guarantee the future contributions from that person will be better, if they aren’t the person writing the code.

The people doing a lot of review are often skilled core members who are intimately familiar with large parts of the codebase. They are the most valuable people working on the project. You’re increasing their workload, while decreasing your ability to upskill the people who report to them.

In that way it’s very easy for LLMs to be a net negative.

20 Likes

I think you’re spot on in your entire post, you really hit the core issue. LLMs can drain time and happiness from the maintainers of a project, which is the exact opposite of what you want.

It really boils down to how you use the AI models. If you create a framework and tell current LLMs to fill out the details, what you’ll get is slop, no question about it… it will indeed have subtle, very-hard-to-spot errors that can cause a lot of headache. However, if you look past the initial slop, actually read it with the intent to understand it, and take ownership of the code, LLMs will let you hit the ground running. Most likely all the correct code is there, but most likely it’s not put together just right, so it’s not a substitute for actual work. You still need to understand memory safety, threading, error handling, etc. If you skip this part, don’t take the time to understand and own the code, and then submit it for someone else to deal with, you’re actually hurting that project by draining time and effort from key members, and that is worth an instant warning/block/mute/ban imo.

When generating Zig code, Claude typically does a fairly good job: it does the trivial things like defer free when appropriate, and usually makes deinit free the right things to clean up. But it delivers a very naive implementation. Now, this is perfect in the sense that you don’t have to waste a ton of time worrying about whether it’s called “writer” or “sink”. It will typically just toss try in front of any function that can fail and bubble the error up, without considering whether that is the right choice. It will always use exactly one allocator, even when a situation clearly begs for at least two, like when you allocate both temporary and permanent data in each iteration of a long loop. In short, if you take the time to use it as inspiration and API lookup rather than as a fixed working solution, it can save you a lot of time.
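The two-allocator pattern mentioned above can be sketched like this (a minimal example against 0.15’s unmanaged `std.ArrayList`; the data is illustrative): long-lived results go through the general allocator, per-iteration scratch through an arena that is reset each pass.

```zig
const std = @import("std");

test "separate allocators for scratch and permanent data" {
    const gpa = std.testing.allocator;

    // Permanent data: survives the loop, freed once at the end.
    var results: std.ArrayList(u32) = .empty;
    defer results.deinit(gpa);

    // Scratch data: lives only for one loop iteration.
    var arena = std.heap.ArenaAllocator.init(gpa);
    defer arena.deinit();

    for (0..5) |i| {
        const tmp = try arena.allocator().alloc(u32, i + 1);
        for (tmp, 0..) |*v, j| v.* = @intCast(j);

        var sum: u32 = 0;
        for (tmp) |v| sum += v;
        try results.append(gpa, sum);

        // Drop all of this iteration's scratch in one call.
        _ = arena.reset(.retain_capacity);
    }

    try std.testing.expectEqual(@as(usize, 5), results.items.len);
}
```

With one allocator for everything, the scratch buffers would either leak or need careful per-item frees; the arena reset makes the intended lifetimes explicit, which is exactly the distinction the naive one-allocator output misses.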

There are two things LLMs do exceptionally well. The first is solving well-known, well-defined problems that are not super time-critical. Ask one to make a digital shopping cart and it will most likely one-shot it, complete with VAT calculations and everything. It can draw on sources in other languages without problems, so even if it’s never seen a shopping cart in Zig, that won’t bother it one bit, because there is plenty of data on how digital shopping carts work; it’s a well-known, well-defined problem. Again, you’ll get a naive implementation that probably does a lot more allocations than strictly needed, but unless you’re Amazon that’s not the optimization you’re looking for; it will probably work and be reasonably performant.

The second thing where LLMs currently shine is when you have a crash incident. I typically just give it the log file and a few relevant code files. I’ve found that Claude can quite often explain the error within twenty seconds, giving you both an explanation of the problem and a small concrete fix. After reviewing this information and confirming that it does indeed match the data, it’s typically safe to apply that fix. I have saved MANY hours and many frustrations by just giving Claude the data and having it digest it for me. So far Claude has not led me astray: if it couldn’t find a smoking gun, it asked for more information, and to date it has correctly identified (though not always provided valid fixes for) all problems that I have thrown at it.

Sorry to disagree; it really depends on the training data distribution and what it has seen in real GitHub repos (and real code conversations, which they are collecting).

LLMs’ understanding is very shallow. For example, if you ask the latest Opus 4.5 about the vdom diffing algo in Vue 2 (which is a fairly old and very popular project, and a simple concept in practice), it will correctly answer with some pseudo-code; but if you add a follow-up question, e.g. “could you also do the pseudo-code if we didn’t have to do Xxx”, then it will just spit out random gibberish. It clearly does not understand some basic concepts and motivations. The same happened to me with data structures in Zig, where I’m far less experienced and less able to judge it critically.

LLM slop is real. Anything you generate zero-shot requires careful review, or it should either be rejected or put straight into the “tech debt” pile. On the other hand, if your task is mechanical, LLMs can easily make you 10x more productive. But I would advise against consulting with LLMs, because it has just never worked for me, and I always regretted it afterwards; it’s just not there yet.

3 Likes

All these things are extremely specific. It depends on your knowledge and on how much you foolishly trust the LLM.

I’ll give you a counter-argument. I was recently debugging an issue where my coroutine-switching code was crashing on macOS in release mode. It was a really deep problem, so the LLM was of no use directly; it was just running in circles.

However, it was EXTREMELY good at guiding me while I stepped through the aarch64 assembly via the QEMU debugger (I have experience with x86_64 assembly, not much with aarch64). If I had been alone in this, I’d never have finished the debugging session. It told me what to watch for, explained why the compiler was generating this kind of assembly, etc.

In the end, it was a problem in the LLVM code: it was silently ignoring some settings. Once I had that knowledge, I used Claude Code to find the problem in LLVM and compare it to the RISC-V backend, which handles it correctly. I ended up opening an LLVM pull request, with the changes started by Claude Code and the rest of the pull request finished by me (tests, etc.). And they were thankful for the change, because it doesn’t matter whether I found it using AI or not: it was a problem, and I could have decided to just work around it (much like the Rust compiler did), but instead I went ahead and fixed it upstream.

None of this would be possible without Claude Code. I just don’t have the time/energy I had 20 years ago.

I have a feeling people are conflating AI assisted programming with vibe coding and completely ignoring the fact that it’s a tool. And it’s a tool that can be used very effectively by people who can actually read/write code quite well.

11 Likes

A perfect example: with your rich coding background, what you lacked was domain knowledge. LLMs won’t do all the thinking or the complex work for you, but if you have coding experience and ask the right questions, you get very solid answers; answers that, without the right domain knowledge, could take you a day or more to arrive at on your own.

And once the problem has been identified, the LLM will have all the right information in its context window to craft a fix and even explain the why and the how of the fix. So even if the fix is sometimes technically wrong, it usually has all the right pieces in it.

I agree 100%.

1 Like

On the general question of LLM usage for coding, I think it’s all up in the air at the moment and changing quickly. My daughter used replit (vibe coding, or what I would call no-code) to create a very good web app for her new company. She’s not a coder and never looks at the code, but is very determined and takes responsibility for testing the behavior of the app. I was very apprehensive about it, but it’s going surprisingly well. I think that’s probably because the generated code is typescript/react and the database is Postgres, so the models are probably as good as they can be. It will be very interesting to see how it goes.

The more specific questions are whether LLMs currently have a place in Zig coding, and also whether it is Ok to post LLM generated code on the forum (this is currently not allowed). Since the models don’t currently do a good job with Zig code in particular, especially for the current version of Zig, it seems clear that it applies less to Zig. But that will change over time and I don’t see how it’s wrong/bad/etc to use it, as long as you take responsibility for the output.

Posting generated code on the forum is going to make it annoying to answer questions, since you can’t tell anything about what the poster knows, or doesn’t know, from looking at the code. So attributing it to LLMs should absolutely be required, IMO. But beyond that, I’m not sure how it helps to have rules. People are going to do it whatever the rules say, and it’s going to be difficult to identify generated code reliably. I’m not sure it’s possible to put up a fence and keep it out, although I really do understand the desire to do that.

PS. I realized that the concern about identifying generated code also applies to a rule requiring people to attribute it to LLMs. So I’m not sure what rules can work.

1 Like

I think the issue with AI, and why it shouldn’t be allowed without full disclosure, is that posting AI-generated slop is disrespectful to the members of the community, and quite selfish. It’s OK to use AI for coding, even in Zig; it’s fine for little things, no denying that. But when someone uses Zig, or any language for that matter, without understanding it, and basically uses humans as a failover to circumvent their inability to get the AI to fix things for them, I think it misses the point of a forum. I’m always happy to help people to the best of my ability, but if all I’m really doing is explaining to someone’s AI what’s wrong, that doesn’t feel like a human interaction to me; it feels more like they’re using me and taking advantage of my kindness. I’m quite sure many here would feel the same.

So it needs to be disclosed, so that at least people are aware, and I think people should refrain from using AI until they have enough knowledge that they could technically do it themselves if they took the time.

For me, this forum is really for when you don’t understand something, can’t find the solution on your own, need something explained, want to talk about something, share a cool project, or get feedback.

15 Likes

I don’t like limiting ideas and education by telling people which code is approved to ask for help with. Maybe that one answer to fix “slop” gets folded into the training data and then it’s fixed for everyone forever after. Code is code. Why the elitism?

2 Likes

You have a good point. But I think/hope it’s not elitism but rather an unwillingness to spend time helping to train models, and implicitly encouraging people to use them, when you don’t believe they’re valuable. If the generated code is disclosed, people can make their own decision.

I find it disturbing that we cannot have a thread about HOW to properly use AI without being derailed by stories about how others have done it poorly in the past. There are plenty of examples of threads with AI that broke forum rules and were closed; the moderation is working.

Please stay on the OP’s constructive track of beneficial ways to use AI with Zig. You’re free to be skeptical, but that belongs in another thread.

4 Likes

I don’t want to be elitist; I’m happy to provide help to everyone genuinely interested in learning. I just don’t want to spend my time on someone’s code if they didn’t even try to solve the problem themselves using their brain. To be clear, even if someone wasn’t using AI but it was clear that they didn’t bother trying first, I would criticize them the same way.

I think it’s only fair, when you come to any community, to make an effort to solve your problem yourself before reaching out. I don’t want this to look like elitism.

7 Likes

Here’s a very concrete data point. I gave Gemini Pro this exact prompt:

Could you translate this code to Zig 0.15?
<My code copypasted from https://github.com/at-lib/sixel/blob/main/sixel.ts>

After changing 3 var keywords to const, it compiled. Then I asked:

Could you then convert this into a test block for it?
<My code copypasted from https://github.com/at-lib/sixel/blob/main/sixel.test.ts>

It provided a main function instead of a test block. After replacing it, the test runs perfectly, producing the exact same output.
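For reference, the shape that was needed: Zig’s built-in `test` blocks run under `zig test` with no entry point at all, which is why the generated `main` had to be replaced. A minimal sketch, with a hypothetical stand-in for the ported routine:

```zig
const std = @import("std");

// Hypothetical stand-in for a routine ported from TypeScript;
// the real sixel code is of course more involved.
fn encodeRepeat(count: usize) usize {
    return count * 2;
}

// A `test` block instead of `pub fn main`; run with `zig test file.zig`.
test "encodeRepeat doubles the count" {
    try std.testing.expectEqual(@as(usize, 14), encodeRepeat(7));
}
```

Asking the model explicitly for a `test` block (and mentioning `std.testing`) in the prompt may avoid the manual replacement step.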

I will have to review it manually anyway just to be sure, and there’s always room for extra comments and improvements, but this sure saved a lot of time porting code that was already written with a lot of care and love in another language. I’m surprised Gemini made no errors or omissions at all in the algorithms.

This suggests to me a viable future strategy: keep developing these kinds of algorithms in TypeScript (when used in web app frontends, I prefer that to binary Wasm blobs) and then port them to Zig for backend use.

The library itself draws an image buffer from memory straight to the terminal using DEC Sixel graphics, which is very nice when working on graphics algorithms in Zig. No need to save the image to a file or interface with GUI libraries.

1 Like