Comptime Garbage Collection

dude_the_builder · January 5, 2024, 3:29pm

I remember Andrew Kelley mentioning (I don’t remember exactly where and when) that the Zig compiler performs comptime garbage collection. It would be nice if someone could explain exactly what that is and how it works. I’m under the impression that this is what cleans up the intermediary arrays in code like this:

const std = @import("std");

fn getEvens(comptime n: usize) []const usize {
    if (n < 2) return &[0]usize{};

    var evens: []const usize = &.{};
    var i: usize = 2;

    while (i <= n) : (i += 1) {
        if (i % 2 == 0) evens = evens ++ [1]usize{i};
    }

    return evens;
}

pub fn main() !void {
    var evens = comptime getEvens(16);
    std.debug.print("evens: {any}\n", .{evens});
    evens = comptime getEvens(10);
    std.debug.print("evens: {any}\n", .{evens});
}

Specifically at evens = evens ++ [1]usize{i}; ?

z1fire · January 5, 2024, 3:32pm

I don’t have the answer but this is a good first use of the Explain category!

mlugg · January 6, 2024, 4:54am

Nice question, and one that’s going to require quite a bit of background. I’m gonna try and aim this response at a fairly general audience, so apologies if it’s a little verbose. (Credentials: core team member and active contributor speaking!)

So, arguably the most important stage of the Zig compiler pipeline is known as Sema. This stage is responsible for semantic analysis – that’s type checking, comptime code evaluation, and generating the IR which is sent to codegen. In the compiler, Sema is a type – found in src/Sema.zig – which contains all of the state necessary to perform this analysis. It has all kinds of fields for different things, but what we’re interested in here is how memory allocation happens.

The main allocator used by the Zig compiler is generally referred to as gpa (for “general purpose allocator”), and is either the libc allocator (std.heap.c_allocator) or std.heap.GeneralPurposeAllocator. From this “base” allocator, we can derive an arena allocator using std.heap.ArenaAllocator, which is basically a big blob of memory which can be freed all at once when you’re done with it. This pattern is used in several places in the compiler, because it’s a pretty easy memory management technique – dump all of your temporary state into an arena, and destroy the arena when the whole task is done.

Sema is no exception – each instance of Sema stores an arena: std.mem.Allocator field. This arena is used for any temporary data needed during semantic analysis – such allocations are thrown into the arena, which is then chucked out when this individual semantic analysis completes. A single analysis is either analyzing the value of a declaration (a global const or var), or is analyzing a runtime function body (type checking it and generating code). Any such analysis may call arbitrary comptime code: for instance, in your snippet, analysis of the runtime body of main evaluates getEvens(16) at comptime.

The next piece of the puzzle is how mutable and immutable memory is represented at comptime. Most comptime-known values are stored in a thing called the InternPool. This is a global store for a huge amount of data – most notably comptime-known values, but also interned strings, and some other weird stuff. (“Interning” is essentially deduplication here; rather than having a whole bunch of copies of the value @as(u8, 0), we use some clever datastructures to keep a unique copy of it, saving memory and allowing fast equality checks.) The InternPool aims to be a fairly memory-efficient representation of arbitrary values (most things are stored as a handful of u32 in a monolithic array – this is a common technique in modern DOD-style Zig code). On the other hand, we have mutable variables (which you might hear called “comptime-mutable memory”). These are not stored directly as interned values, largely for performance reasons: interning them would mean that mutating, say, a single element of an array would require reconstructing the entire array in the pool. Instead, we have a separate value representation optimized for mutating aggregates. This representation is able to live in the Sema arena, because a rule of Zig is that references to comptime-mutable memory are not allowed to “leak” into the value of a global or into runtime code (*).

Regarding interned values (those stored in the InternPool), the idea is that the pool can undergo periodic garbage collection using a simple conventional mark+sweep approach. However, this is currently vaporware, since the InternPool is a fairly new creation (more on this later). Thus, interned values are currently never cleaned up.

So, looking at your code, we note that our directly mutable memory (evens) is actually a slice, not an array. This means that the array this slice is referencing isn’t comptime-mutable memory, and thus uses the interned representation. That actually means that the temporary arrays here don’t undergo any garbage collection!

For sake of example, let’s pretend you instead wrote var evens: [n/2]usize = undefined and mutated that array. In this case, we directly mutate the array values using the un-interned representation, so we don’t have any garbage values persisting.

Now that I’ve explained this, it might reasonably seem pretty weak – it sure feels like a stretch to call this “garbage collection”. The reason for this is that we’re currently in a slightly bizarre transitional phase in the compiler’s evolution. The InternPool is a relatively new addition: a few months back, every value used what we now know as the “un-interned representation”, and all values were thrown into the Sema arena (and needed ones were duplicated again). This approach had a ton of problems, including excessive memory usage, slow lookups, messy datastructures, no serializability… but one advantage it had was that it allowed for this arena-based “garbage collection” to do a lot more. It’s probably possible to write contrived examples which leak memory in 0.11 (which had the InternPool) but did not in 0.10 (which did not). However, in practical cases, the new system usually still works better, because our shiny new interned representations are generally far more compact than our legacy ones were. This representation comes with several other advantages too, particularly concerning the compiler’s incremental compilation capabilities. In essence, we’ve regressed this “garbage collection” behavior somewhat in exchange for improvements in several other places.

We’re not done here – making comptime memory as efficient as possible is a planned area of future focus. Having trivial loops leak memory at comptime is unfortunate, and we want to improve on this. In a similar vein, we have a lot of work to do on the performance of comptime code execution. However, right now, our focuses lie elsewhere: stabilizing the language, progressing self-hosted codegen backends, implementing incremental compilation, etc.

(*): okay, this is supposed to be a rule. The relevant compile error isn’t yet implemented, and something is broken somewhere, so we currently do some cheatey stuff to make comptime-mutable memory remain valid even after we are meant to be done with it. We’ll fix this soon!

gonzo · January 6, 2024, 8:07am

This is an amazingly complete and detailed answer – thanks @mlugg !

tensorush · January 6, 2024, 8:57am

Fantastic! Thanks for this write-up on the current state of things!

dude_the_builder · January 6, 2024, 1:03pm

Wow! Super interesting and fascinating how you guys are evolving the Zig compiler in so many ways. I really appreciate you taking the time to share this knowledge with the community. Cheers!