Comptime-Mutable Memory Changes

Hi everyone! Recently, PR #19414 was merged, which more-or-less rewrote the compiler’s internal implementation of comptime var. This rewrite comes with some language changes, so I thought I’d explain the user-facing impact of this PR here so that people know what’s going on when they inevitably get errors.

TL;DR

A pointer to a comptime var is never allowed to become runtime-known, to be contained within the resolved value of a declaration, or to be otherwise referenced by analysis of any declaration other than the one which created it.

The most likely way for this to manifest is in a function computing a slice at comptime by filling an array. To fix the error, copy your finalized array to a const before taking a pointer.

If you used the previous semantics to create global mutable comptime state, your code is broken. This use case is not and will not be supported by Zig – please represent your state locally.

If that all made sense, there you go – you can go and spend your time on something better! Otherwise, read on and I’ll explain this in more depth.

The Long Version

Zig supports the concept of a comptime var: a mutable, comptime-known, local variable. These can be created with explicit comptime var syntax, or, if a function is evaluated at comptime, all vars within it are implicitly comptime.

As of commit 4055022, there are some restrictions in place on the usage of comptime var.

Firstly, a pointer to a comptime var is never allowed to become runtime-known. For instance, consider this code:

test "runtime-known comptime var pointer" {
    comptime var x: u32 = 123;
    // `var` makes `ptr` runtime-known
    var ptr: *const u32 = undefined;
    ptr = &x;
    if (ptr.* != 123) return error.TestFailed;
}

This code previously worked as you might expect. Now, it is a compile error, because the assignment to ptr makes the value &x – which is a pointer to a comptime var – runtime-known.

Such pointers can also become runtime-known by, for instance, being passed to a function:

fn load(ptr: *const u32) u32 {
    return ptr.*;
}
test "comptime var pointer as runtime parameter" {
    comptime var x: u32 = 123;
    if (load(&x) != 123) return error.TestFailed;
}

This test also emits a compile error. The call to load occurs at runtime, and its ptr argument is not marked comptime, so ptr is runtime-known within the body of load. This means that the call to load makes the pointer &x runtime-known, hence the compile error.
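A minimal sketch of one way to fix both tests above, assuming the runtime code only needs to read the final value: copy the comptime var into a const and take the pointer to that copy instead. The test name and the `final` identifier are illustrative, not part of the PR.

```zig
const std = @import("std");

fn load(ptr: *const u32) u32 {
    return ptr.*;
}

test "runtime pointer to a const copy of a comptime var" {
    comptime var x: u32 = 123;
    x += 1; // still mutable at comptime up to this point

    // `final` is a comptime-known const, not a comptime var, so a
    // pointer to it is allowed to become runtime-known.
    const final = x;
    try std.testing.expectEqual(@as(u32, 124), load(&final));
}
```

Later changes to `x` are not visible through `&final`, which is the point: the runtime code sees one fixed value.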

The second rule is that a pointer to a comptime var cannot be contained within the resolved value of a container-level const or var declaration. Intuitively, the comptime var is not allowed to “leak out” of the declaration that created it. Here is an example which violates this rule:

const ptr: *const u32 = ptr: {
    var x: u32 = 123;
    break :ptr &x;
};
comptime {
    _ = ptr;
}

Attempting to compile this code will emit a compile error, because the global value ptr contains a pointer to the comptime var named x. This can manifest in much subtler ways, such as nested pointers and/or data structures:

const S = struct { counter: *const u32 };
const ptr: *const S = ptr: {
    var x: u32 = 123;
    const s: S = .{ .counter = &x };
    break :ptr &s;
};
comptime {
    _ = ptr;
}

The same error occurs here: the value of the global ptr contains the value &s, from which we can access s, which contains the value &x, from which we can access x. Note that it doesn’t matter that there is a const in the mix, nor that all pointers involved are marked const: if there is any accessible reference to a comptime var, the value is invalid.

This rule is the most common way this error will manifest. It comes up often in comptime code constructing slices:

pub const my_name: []const u8 = name: {
    var buf: [5]u8 = undefined;
    // In practice there'd be some complex logic here to set up `buf`!
    @memcpy(&buf, "mlugg");
    break :name &buf;
};

Analysis of this declaration fails just like the above, for the exact same reason: the slice value, even though marked const, contains a reference to the comptime var named buf. The solution here is to copy the finalized data into a const, to avoid any references to a comptime var:

pub const my_name: []const u8 = name: {
    var buf: [5]u8 = undefined;
    @memcpy(&buf, "mlugg");
    const final = buf;
    break :name &final;
};
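The same pattern extends to helper functions. Here is a rough sketch, assuming the helper can return its array by value so that no pointer to its internal buffer ever leaves the call. The names `repeat` and `banner` are made up for illustration.

```zig
const std = @import("std");

// Builds the result in a local buffer and returns it by value.
// When called at comptime, `buf` is implicitly a comptime var, but
// only a copy of its contents leaves the function.
fn repeat(comptime s: []const u8, comptime n: usize) [s.len * n]u8 {
    var buf: [s.len * n]u8 = undefined;
    for (0..n) |i| {
        @memcpy(buf[i * s.len ..][0..s.len], s);
    }
    return buf;
}

pub const banner: []const u8 = blk: {
    // Store the finished array in a const, then take the slice from that.
    const final = repeat("ab", 3);
    break :blk &final;
};

test "banner is built at comptime" {
    try std.testing.expectEqualStrings("ababab", banner);
}
```

Returning the array by value keeps the const copy at the single place where the slice is created, rather than inside every helper.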

There’s one more context in which errors can now be raised. It’s fairly esoteric, so I won’t go into detail, but if a container type captures a pointer to a comptime var from an outer scope, the error will also be emitted. For instance:

pub const x: u32 = blk: {
    var unused = "hello";
    _ = struct {
        comptime {
            // This reference causes this `struct` to capture
            // a pointer to `unused`.
            _ = &unused;
        }
    };
    break :blk 123;
};

Despite the fact that unused is not in any way referenced by the final value of x, and that the struct type is never even used, its creation here raises a compile error by capturing a pointer to a comptime var. If you do happen to hit this, chances are you don’t need to capture by pointer – try just accessing the variable directly!

What about my global mutable comptime state?

Previously, you could get a global pointer to comptime memory and use it to make, for instance, a global counter:

const counter: *u32 = c: {
    var x: u32 = 0;
    break :c &x;
};

This code is not correct. It will not work, and Zig does not and will not have any method for global comptime-mutable state. Please represent your state locally.
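As a rough sketch of what "locally" can look like, assuming the counter is only needed while computing a single declaration: keep it inside one comptime evaluation and let only the finished value escape. `assignIds` and `ids` are hypothetical names for illustration.

```zig
const std = @import("std");

// The counter lives entirely inside this one comptime call.
fn assignIds(comptime n: usize) [n]u32 {
    var counter: u32 = 0;
    var out: [n]u32 = undefined;
    for (&out) |*slot| {
        counter += 1;
        slot.* = counter;
    }
    return out; // returned by value, so no pointer to `counter` or `out` escapes
}

// Container-level initializers are evaluated at comptime.
pub const ids = assignIds(3);

test "ids are assigned from local comptime state" {
    try std.testing.expectEqualSlices(u32, &[_]u32{ 1, 2, 3 }, &ids);
}
```

Because the result depends only on the arguments, it is the same regardless of the order in which declarations are analyzed.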

Why?

At first glance, these changes seem a little arbitrary and unjustified: however, they are important for reasons relating to both language soundness and compiler design.

Firstly, language soundness. Before this change, it was possible for a pointer to a comptime var to be known at runtime. If you’re just loading from the pointer, that isn’t necessarily catastrophic, although it’s still not great, because you’ll only be able to read the “final” value of the comptime var. But even worse, what if you try to store to the pointer? Well, I can tell you: the answer is that shit breaks. Here’s one example of an old issue where someone completely overlooked this problem, which is completely understandable, because within Zig’s type system the invalid code was considered fine. These recent changes mean that it’s impossible for a pointer to a comptime var to be used at runtime, avoiding this problem entirely.

Secondly, compiler design. It’s no secret that I’ve been doing a lot of work on incremental compilation lately: this is likely to be one of the most significant features, not necessarily of Zig as a language, but of its compiler. Incremental compilation relies on the fact that global declarations are largely independent of one another, and their interdependencies can be easily modeled (for instance, one global declaration might depend on the resolved value of another). We also depend on the fact that the order of analysis of different declarations is (aside from some awkward edge cases, which eventually will either be ironed out or defined as implementation-defined behavior) unobservable. Unfortunately, global comptime state breaks both of these promises. Loads and stores of this state depend entirely on analysis order, and re-analyzing a declaration may mutate global state in ways which would not happen without incremental compilation. In short, the previous behavior was completely incompatible with incremental compilation.

As a nice bonus, this change has given way to some huge cleanups of internal compiler datastructures. This will almost certainly lead to user-facing improvements: bugfixes, performance improvements, and eventually, incremental compilation.

The road to incremental is long and winding, but we’re marching it nonetheless 🙂


Thank you so much @mlugg for taking the time to explain these changes in detail. This type of information is of immense value to the community since we’re all surfing the wild wave that is the constantly evolving Zig compiler!


Most interesting read, thanks!

To further my understanding, is it correct to say that the two changes are mostly orthogonal?

The change about not leaking pointers to declarations is essential for incremental. It ensures that all memory transitively reachable from a (resolved) declaration value is truly immutable, which is absolutely required for incremental, and for avoiding hard-coding compilation order into the reference.

The change about not leaking pointers to runtime is not essential. We could allow that, it doesn’t break the language in fundamental ways, but it is just not a useful semantics. At minimum you’d want some sort of a lint that allows leaking only const pointers to var comptime data, but, at that point, it’s easier to just forbid the whole thing and require the user to copy to const manually. This does lose a bit of expressiveness, where we can take a pointer first, and compute the comptime value after that, but, given that all comptime vars are now local anyway, this is probably a very minor loss, if any.


I suppose this gets a bit philosophical: what is “breaking the language in a fundamental way”? To me, being able to write completely correct code, but then just making one thing in it runtime-known and having it result in a runtime crash is indeed a fundamental breakage. The type system falsely confirming mutability is a fundamental breakage: it’s basically failing to do the one thing a type system is meant to do.

But yes, you could conceptualise a version of Zig which does not have this restriction, it just has some rather insane semantics, whilst the first change is strictly necessary for incremental (under our current model thereof).


@mlugg appreciate the explanation for this change.

Is there a better way to get info on where a comptime var might be referenced? This change is giving me a “global variable contains reference to comptime var” error for one of my global constants. However, the comptime var must be in a function call somewhere that’s being called to set a field on the constant, making it difficult to find.

I’m afraid there’s currently nothing great; improving the error is on my to-do list. You could maybe try @compileLog on the value: if anything says (comptime alloc) (this is actually an internal detail leaking out due to incomplete value printing code lol) then that’s either a pointer to a comptime var or a pointer to something that itself contains a comptime var. This is a clunky UX, and literally relies on a TODO in the compiler, sorry about that. I’ll get around to it soon!


Thanks!

Funnily enough I found the last one as you posted this (I had around 10 such references scattered throughout my library). They’re all fixed now using the const duplication method you detailed above.

Is this already in the nightly build?

Are you asking if this new comptime var paradigm is in the latest Zig? If so, then yes. I updated my code on v0.12-dev3457 which is the latest I believe.


ZLS no longer compiles with the latest master

I strongly appreciate the clarity this change brings, this turned a couple of my compiler crashes into proper errors 😀

I’m confused about one thing though: Vexu pointed out in his reply to my bug here that I can ensure a value is runtime rather than comptime (dodging the new error) by marking this function inline: Crash iterating over tuple elements via pointer in comptime · Issue #19428 · ziglang/zig · GitHub

I wrote another version that iterates the field info of the tuple declaration with std.meta.fields() so I would be returning a pointer to the const tuple decl’s element rather than whatever was captured in a loop, but I still hit the “runtime value contains reference to comptime var” error.

My intuition was that returning the address of a decl in a struct wouldn’t be a comptime var, but clearly that’s not the case. What’s a good mental model to apply here?

Ah, this is an awkward thing to do with comptime fields – I’m hoping we’ll change the language to fix this case. The issue is that ATTRIBUTES is considered to be a tuple with comptime fields, since it’s an untyped array init with comptime-known field values. Comptime fields are weird enough that they act much like comptime-mutable memory, which is genuinely very unfortunate, and ought to be changed.

Rest assured that your mental model is fine – the language is flawed here, and my error message isn’t precise enough. Sorry!

To work around this, just capture attrib by value in your loop and take a pointer to it inside the loop if you need to. (In case you didn’t know, &expr where expr is comptime-known yields a pointer with infinite lifetime!)


Ahh gotcha, glad my mental model was sensical, but your explanation of how the untyped array init leads to comptime fields also makes sense with all of that additional context.

I have totally (ab)used this in other contexts but hadn’t thought to apply it here. thanks!

It usually takes some time for projects to catch up with changes to the language or standard library if they’re tracking master, since someone has to notice that the changes affect their project and make the necessary adjustments.


wow, 12 hours later everything is back to normal.
ZLS master is functional

excellent


Is it not possible to mark the comptime var as const and “freeze” it whenever it becomes accessible from a container-level const or var declaration, or a runtime-known pointer?

That seems like it would avoid the same semantic pitfalls, without requiring the user to manually (recursively) copy values to a const. At least, that would save the trouble for many simple cases where the comptime var does not escape or is not used again.

hello

const std = @import("std");

// The cook asks how many eggs you have in your basket

const alloc = std.heap.page_allocator;
pub fn main() !void {
    var args_iter = std.process.args();
    _ = args_iter.next(); // skip program name
    const second_arg = args_iter.next() orelse "0";

    try check_my_basket(second_arg);

    try go_fetch_eggs();

    the_problem_is_the_cook_needs_to_be_sure_of_the_number_of_eggs();
}

fn check_my_basket(eggs: []const u8) !void {
    const cook = try std.fmt.allocPrint(alloc, "I look in my basket, I have {d} eggs", .{try std.fmt.parseInt(usize, eggs, 10)});

    std.debug.print("answer I'm not sure: {s}\n", .{cook});
    // std.debug.print("answer I check: {s}\n", .{test_eggs(try std.fmt.parseInt(usize, eggs, 10))});
    // I cannot test, because the function is comptime
}

fn go_fetch_eggs() !void {
    const cook = try std.fmt.allocPrint(alloc, "I need to go fetch eggs, I have {d} eggs", .{how_many_eggs()});

    std.debug.print("answer I'm not sure but?: {s}\n", .{cook});

    // std.debug.print("answer I check: {s}\n", .{test_eggs(how_many_eggs())});
}

fn how_many_eggs() usize {
    // 20 hens laid 24 eggs
    const num_hens: usize = 24;
    return num_hens;
}

fn the_problem_is_the_cook_needs_to_be_sure_of_the_number_of_eggs() void {
    // he's clever because he only has 20 hens?
    const num_eggs: usize = 24;

    std.debug.print("answer I check: {}\n", .{test_eggs(num_eggs)});
    const cook = std.fmt.comptimePrint("I have {d} eggs", .{num_eggs});
    // why comptime because his kitchen is immutable to make a bacon omelette for 12 people
    std.debug.print("answer I'm sure I can make the omelette for 12 people: {s}\n", .{cook});
}

fn test_eggs(comptime n: usize) bool {
    return n >= 24;
}

Zig has many functions that enable modeling or metaprogramming, but comptime seems to rule this out here.
I accept that a comptime value is immutable from the moment it is declared until the end of its use, but its construction should be able to take variables as input. However, comptimePrint only works with comptime-known constants, so comptime functions cannot be used for this kind of metaprogramming.