- What’s an “IES”?
Oops, sorry! IES stands for “inferred error set”. When a function has an inferred error set (`fn foo() !void { ... }`), then code which e.g. performs a `switch` on that error set has to “resolve” it: figure out the concrete errors it contains. To do that, we have to analyze the function body (if we haven’t already) so that we encounter all ZIR that could return errors, which implicitly adds them to the IES.
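For concreteness, here’s a minimal sketch of an IES being resolved by a `switch` (the function and its error names are just illustrative):

```zig
const std = @import("std");

// `parse` has an inferred error set: the `!u32` return type tells the compiler to
// collect every error the body can return (`error.Empty` plus whatever
// `std.fmt.parseInt` can fail with).
fn parse(s: []const u8) !u32 {
    if (s.len == 0) return error.Empty;
    return std.fmt.parseInt(u32, s, 10);
}

test "switching on an inferred error set forces it to be resolved" {
    const n = parse("123") catch |err| switch (err) {
        // For this switch to be checked for exhaustiveness, the compiler must
        // resolve the IES of `parse`, which means analyzing its body.
        error.Empty => 0,
        error.InvalidCharacter, error.Overflow => 0,
    };
    try std.testing.expectEqual(@as(u32, 123), n);
}
```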
- How “values” work? […]
Yep, your understanding is totally right, and types and functions are indeed also values which we do store in the `InternPool`. The basic way this works is that every value (including types) we create is assigned a 32-bit “index” into the pool; this index points into an `items` array, which for more complex values may also reference some data stored separately. When we want to add a value, there are some efficient data structures centered around `ArrayHashMap` that allow us to quickly find the value in `items` and return that existing index if it’s already there; if it’s not there, we add the new item onto the end.
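To make that concrete, here’s a toy sketch of the interning pattern – far simpler than the real `InternPool` (which stores structured items, not strings), but it shows the “look it up, else append it” behaviour and the stable 32-bit indices:

```zig
const std = @import("std");

/// Toy value pool: values are just byte strings here, purely for illustration.
const SimplePool = struct {
    pub const Index = u32;

    /// An `ArrayHashMap`-style structure: the map's insertion order doubles as
    /// the `items` array, so a value's index is stable once assigned.
    map: std.StringArrayHashMap(void),

    pub fn init(gpa: std.mem.Allocator) SimplePool {
        return .{ .map = std.StringArrayHashMap(void).init(gpa) };
    }

    pub fn deinit(pool: *SimplePool) void {
        pool.map.deinit();
    }

    /// Return the existing index if the value is already interned;
    /// otherwise append it and return the new index.
    pub fn intern(pool: *SimplePool, value: []const u8) !Index {
        const gop = try pool.map.getOrPut(value);
        return @as(Index, @intCast(gop.index));
    }
};

test "equal values intern to the same index" {
    var pool = SimplePool.init(std.testing.allocator);
    defer pool.deinit();

    const a = try pool.intern("u32");
    const b = try pool.intern("u32");
    const c = try pool.intern("bool");
    try std.testing.expect(a == b);
    try std.testing.expect(a != c);
}
```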
- I guess one subquestion is how do we determine “sameness” of values, the question of equality and identity. […]
The notion of uniqueness for interned values loosely aligns with “equality” in the Zig sense. However, the example you’ve given of moving a struct declaration gets a bit deeper into the weeds.
Let’s say that on an incremental update, we move a `const S = struct { ... };` declaration down one line(*). Then, the source code of that declaration itself is perfectly unchanged – we’re just considering the bytes from the `const` to the `;`, and they haven’t changed. So, if there are no other changes, the dependencies of the `Decl` corresponding to `S` would not be invalidated. But instead, let’s suppose that we, say, added a newline between `=` and `struct`, so that the source bytes have actually changed, even if not meaningfully. Then, the `src_hash` dependency of the `Decl` corresponding to `S` would be invalidated, so we would re-analyze the ZIR instruction which declares a struct.
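As a concrete (made-up) version of that second edit:

```zig
const std = @import("std");

// Before the update this was `const S = struct { x: u32 };` on one line.
// After adding a newline between `=` and `struct`, the bytes from `const` to `;`
// differ, so the `src_hash` dependency of the Decl for `S` is invalidated and the
// `struct_decl` ZIR instruction is re-analyzed – even though nothing changed
// semantically.
const S =
    struct { x: u32 };

test "S is semantically unchanged" {
    const s: S = .{ .x = 42 };
    try std.testing.expectEqual(@as(u32, 42), s.x);
}
```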
Now we have to introduce something new. As well as the `Decl` we already have corresponding to `S`, types have this thing called an “owner `Decl`”. It doesn’t correspond to a source declaration as such, but it’s the context in which we perform resolution of the type (to allow self-reference, type resolution happens lazily, in a few stages). That `Decl` contains all dependencies which arise from type resolution – for instance, if the field types of a `struct` make reference to a declaration, the struct’s owner `Decl` will have a `decl_val` dependency on whatever was referenced.
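For example (purely illustrative):

```zig
const std = @import("std");

// Resolving `List`'s field types requires the value of `Elem`, so the owner Decl
// of `List` gets a `decl_val` dependency on `Elem`: if `Elem` changes, `List`'s
// type resolution is invalidated and redone.
const Elem = u32;

const List = struct {
    items: []const Elem,
    len: usize,
};

test "field types referencing another declaration" {
    const l: List = .{ .items = &.{ 1, 2, 3 }, .len = 3 };
    try std.testing.expectEqual(@as(usize, 3), l.len);
}
```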
When we re-analyze a `struct_decl` ZIR instruction, we look up in the `InternPool` the type which exists at the same AST node (in reality we track it based on the ZIR instruction index) and with the same set of captured values. This is #18816 – structs are considered equivalent if they live at the same source location and reference all the same external values. However, if this lookup succeeds, we aren’t quite done: we need to check if the owner `Decl` of the type is marked as outdated.
If it’s not, everything is fine – we reuse the existing type. However, if it’s outdated, that means the structure of this type (something to do with its fields or memory layout) has changed. That means, in essence, that every use site that used this struct needs to be analyzed again – so we need a “new” type, with a fresh `InternPool` index to make sure it’s considered distinct. Therefore, we remove the old type from the pool (the item is replaced with a dummy `removed` value which lookups will never match) and create a new one (with a new owner `Decl` to match). Then, when analysis of `S` finishes, its resolved value will ultimately be this “new” type, so any dependencies on `S` are invalidated and re-analyzed, etc.
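Incidentally, you can loosely see the “same source location, same captured values” idea from userspace – it lines up with the usual memoization of generic types:

```zig
const std = @import("std");

// The struct inside `Pair` lives at a single source location and captures `T`.
// Instantiations with equal captured values resolve to the same type;
// different captured values give distinct types.
fn Pair(comptime T: type) type {
    return struct { first: T, second: T };
}

test "same source location and same captures gives the same type" {
    try std.testing.expect(Pair(u32) == Pair(u32));
    try std.testing.expect(Pair(u32) != Pair(u8));
}
```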
This system is a bit weird and subtle, and will get at least a little easier to understand after the proposed internal reworks I linked to above (which have since been greenlit by Andrew) go through, but I hope this made some sense!
*: moving it by a column is a bit weirder because of some details of how we track source locations internally for debug info, but I don’t feel like getting into that now
- What happens in this scenario: […]
When we’re performing an incremental update, we don’t have a good way to immediately tell if things will become unreferenced. Of course, in this example it’s fairly obvious in theory, but you can imagine a case where it’s not; start with this:
```zig
export const a: u32 = b;
const b = 123;
```
…and change it to this:
```zig
export const a: u32 = 123;
const b = @compileError("bad");
```
Here, re-analyzing `b` means we’ll emit the compile error, even though `b` isn’t actually referenced any more! But it’s quite hard for us to know that ahead of time.
So, on an incremental update, we re-analyze anything that we’ve previously analyzed which could be referenced. How do we deal with things becoming unreferenced?
The answer is currently vaporware, but I’ve done some preemptive work on it. The idea is that after all analysis is complete, we will perform a graph traversal on a big table of references between `Decl`s and runtime functions. These are kind of like dependencies, but we have different constraints on what accesses need to be fast on the data structure, and the entries are a little different, so they’ll be stored separately. Anything we reach in this traversal is “alive”: it needs to be in the binary, and any of those things that failed should emit their respective compile errors. However, anything that we don’t reach we know to be unreferenced, so we can just ignore those compile errors! We could also instruct codegen/linker to remove it from the binary if it’s there, but – at least for Debug builds, which are the primary focus of incremental compilation – that’s not strictly necessary.
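As a rough illustration of what that traversal does (a toy sketch with made-up data structures, not the compiler’s actual representation):

```zig
const std = @import("std");

/// Toy liveness pass: `refs` holds (from, to) edges between analysis units
/// (Decls / runtime functions), and `roots` are the units that are unconditionally
/// alive (e.g. exported Decls). Anything not reached is unreferenced, so any
/// compile error it produced can simply be ignored.
fn markAlive(alive: []bool, refs: []const [2]u32, roots: []const u32) void {
    @memset(alive, false);
    for (roots) |root| alive[root] = true;

    // Propagate reachability to a fixed point; fine for a toy example
    // (the real thing would do a proper graph traversal).
    var changed = true;
    while (changed) {
        changed = false;
        for (refs) |edge| {
            if (alive[edge[0]] and !alive[edge[1]]) {
                alive[edge[1]] = true;
                changed = true;
            }
        }
    }
}

test "unreached units are unreferenced" {
    var alive: [4]bool = undefined;
    // Unit 0 references unit 1; unit 2 references unit 3. Only unit 0 is a root,
    // so units 2 and 3 are unreferenced and their errors would be suppressed.
    markAlive(&alive, &.{ .{ 0, 1 }, .{ 2, 3 } }, &.{0});
    try std.testing.expectEqualSlices(bool, &.{ true, true, false, false }, &alive);
}
```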
So, let’s consider your change. Here’s what happens on the rebuild:
- Some `src_hash` dependencies are invalidated, which results in a bunch of re-analysis. If any `test` was modified, it is re-analyzed and emitted to the binary regardless. If re-analysis of the `test` now emits a compile error, we save it for later.
- We begin traversing the graph of references between `Decl`s and functions. When we encounter a namespace type (`struct` etc.) which is referenced, we traverse its namespace to figure out which `Decl`s are “implicitly” referenced. Notably, that includes any `test`s which pass the test filter. So here, we will exclude any `test`s not matching the filter from the set of things reachable in the traversal.
- Some `test`s are now unreferenced. We don’t include them in the test functions passed to the test runner. If this update introduced compile errors in a test function which is now known to be unreferenced, we silently ignore that error – although we do keep it stored for future incremental updates, since if a future update makes the test referenced again, we can just use the error message that we’ve already found!
I hope that all makes sense!
- What is the “granularity” of incrementality? […]
You’ve probably figured out from some of the answers above that yes, as you suggested, the most “atomic” thing is the analysis of an entire `Decl` or runtime function body. In the example you gave, we would indeed recompute the entire call. This is arguably something of a weakness of the system; our saving grace is that even today, the compiler is pretty damn fast (ignoring LLVM). Assuming you’re not horribly abusing comptime, individual analysis of functions or `Decl`s is generally gonna happen on the scale of milliseconds. The granularity we have is a nice middle ground between compiler complexity and useful incrementality.