Why can't the Zig compiler detect usage of undefined values?

Hi,

I had a conversation on the Zig Discord server on why undefined exists.

I understand it’s useful for reserving space for variables that can’t be assigned right away, but I found it strange for such an explicitly anti-footgun language to allow the usage of undefined without ever warning the developer. The reasoning that was given to me for this was that it would require a lot of analysis.

After additional research, I found that the compiler does actually fail to build if you dereference an undefined value, which I found interesting.

Can anyone explain to me why usage of undefined values in general isn’t checked, while dereferencing an undefined variable is?

Thanks

It is genuinely useful to not have to specify a value, but it is certainly a footgun, which is why it is an explicit keyword in Zig, whereas in some other languages it’s much more implicit.

The machinery to track, as far as possible, whether a value is or might be undefined is big, complicated, and not at all easy to build. It will be a long time before Zig gets it, if it ever does.

Currently, Zig only covers the easier cases.

Zig is not anti-footgun in the sense of doing everything possible to forbid footguns; rather, Zig hopes to provide the tools to catch them as easily as possible without being overly restrictive. But they can and certainly do exist in the language. As an example, search for posts about reading or writing after 0.15; the new interfaces have a pretty big one.


Hi,

Runtime safety for undefined values has been planned for a long time but has just not been implemented yet (tracking issue: “runtime safety for branching on undefined values and other undefined behavior caused by undefined values”, ziglang/zig issue #63).
In safe build modes, undefined is lowered to a 0xAA/0b10101010 bit pattern which is supposed to be recognizable and large enough to not be a reasonable value in most cases. This catches some usages of undefined (e.g. pointer dereferences) pretty reliably by causing a crash, but is obviously not a full replacement for actual tracking of undefined memory.
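To make the 0xAA fill concrete, here is a minimal sketch. It assumes a Debug build, and the helper name peekUndefined is made up for illustration. Reading an undefined value like this is not something real code should rely on; it is only a way to look at what the compiler happened to put there.

```zig
const std = @import("std");
const builtin = @import("builtin");

/// Hypothetical helper: returns the raw bits of a deliberately
/// uninitialized u32. For inspection only, never rely on this in real code.
fn peekUndefined() u32 {
    var x: u32 = undefined;
    // Read through a volatile pointer so the load actually happens.
    const p: *volatile u32 = &x;
    return p.*;
}

pub fn main() void {
    // In a Debug build this is expected to show the repeated 0xAA byte
    // pattern; other build modes make no such promise.
    std.debug.print("mode = {s}, bits = 0x{x}\n", .{ @tagName(builtin.mode), peekUndefined() });
}
```

A pointer filled with this pattern points at an address that is very unlikely to be mapped, which is why dereferencing it tends to crash right away, while an integer filled with it is just a strange-looking number.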

What you’ve encountered is dereferencing an undefined value at comptime. Since comptime Zig does track undefined values, it can catch such illegal behavior and emit a compile error. The same thing happens if you e.g. branch on an undefined value at comptime:

comptime {
    const x: *u8 = undefined;
    const y = x.*;
    _ = y;
}

error: cannot dereference undefined value
    const y = x.*;
              ~^~

comptime {
    const x: bool = undefined;
    if (x) {}
}

error: use of undefined value here causes illegal behavior
    if (x) {}
        ^

It’s not that comptime Zig tracks undefined; rather, since comptime Zig is interpreted from unprocessed IR[1], it will just hit an undefined node while interpreting, which is trivial to turn into a compile error.


  1. In fact, comptime is just the facet of analysis on that IR that can or must be done at compile time; this is why you can say type checking, lazy analysis, etc. are all part of the same comptime.

One situation where ‘using’ an undefined value is useful is for an ‘initInPlace’ method like here in my emulators:

…where the initInPlace impl looks like this:

pub fn initInPlace(self: *Self, opts: Options) void {
    self.* = .{ ... };
}
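For readers who have not seen the pattern, here is a self-contained sketch of how such a method is typically called. The Emu type, its fields, and the start_pc option are hypothetical stand-ins, not the actual emulator code:

```zig
const std = @import("std");

// Hypothetical options type, for illustration only.
const Options = struct { start_pc: u16 = 0 };

// Hypothetical stand-in for an emulator with some state.
const Emu = struct {
    const Self = @This();

    pc: u16,
    ram: [1024]u8,

    pub fn initInPlace(self: *Self, opts: Options) void {
        // Overwrite the whole struct, so no undefined field survives.
        self.* = .{
            .pc = opts.start_pc,
            .ram = [_]u8{0} ** 1024,
        };
    }
};

pub fn main() void {
    // Reserve the storage first, then fill it in place.
    var emu: Emu = undefined;
    emu.initInPlace(.{ .start_pc = 0x0100 });
    std.debug.print("pc = 0x{x}\n", .{emu.pc});
}
```

The undefined here only reserves storage; nothing reads emu before initInPlace has written every field, so the undefined value is never actually observed.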

I guess it’s even more confusing when coming from JS/TS where undefined is an explicit runtime value like null and where you can do things like if (val === undefined) { ... }.


Well yes it kinda does, because undef exists as a separate value that can take on any type. undefined is not tied to a literal undefined AST node, that’s just the ‘origin’ of an undefined value. undefined can propagate through comptime expressions:

comptime {
    const x: u8 = undefined;
    const y = 5 +% x;
    @compileLog(y);
}

Compile Log Output:
@as(u8, undefined)

I’d argue that that’s tracking of undefined values.

That’s obviously not possible for e.g. a runtime u8 because every bit pattern of a u8 is a valid u8, so there’s no special state available to represent undefined and there would have to be some kind of external tracking for whether the value is defined or not.
The comptime Zig implementation stores values as indices into a data structure that holds their concrete values (src/InternPool.zig), so it doesn’t have that restriction and can represent undefined as a separate state without much overhead.
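As a rough illustration of what “external tracking” would mean at runtime, here is a user-level sketch. The Tracked wrapper is hypothetical and is not how the compiler or any planned safety feature works; it only shows that the defined/undefined state has to live somewhere outside the u8 itself:

```zig
const std = @import("std");

/// Hypothetical wrapper: pairs a value with an explicit "is defined" flag,
/// because the value's own bits have no spare state to encode that.
fn Tracked(comptime T: type) type {
    return struct {
        const Self = @This();

        value: T = undefined,
        defined: bool = false,

        pub fn set(self: *Self, v: T) void {
            self.value = v;
            self.defined = true;
        }

        /// Returns null until a value has been stored.
        pub fn get(self: *const Self) ?T {
            return if (self.defined) self.value else null;
        }
    };
}

pub fn main() void {
    var t: Tracked(u8) = .{};
    std.debug.print("defined before set: {}\n", .{t.get() != null});
    t.set(0xAA); // 0xAA is a perfectly valid u8, so the bits alone prove nothing
    std.debug.print("defined after set: {}\n", .{t.get() != null});
    std.debug.print("@sizeOf(u8) = {d}, @sizeOf(Tracked(u8)) = {d}\n", .{ @sizeOf(u8), @sizeOf(Tracked(u8)) });
}
```

The extra flag is the cost: every tracked value needs additional storage and a check on each read, which is part of why doing this for all runtime values is not free.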