Inconsistency when undefined is assigned to optional

The following code behaves differently depending on whether runtime safety is active:

const std = @import("std");

pub fn main() !void {
    var a: ?u32 = @as(?u32, undefined);
    const b = &a.?;
    b.* = 0x1234;
    std.debug.print("{?x}\n", .{a});
}

When optimize is Debug or ReleaseSafe, the output is 1234. When it’s ReleaseSmall or ReleaseFast, we get null instead. Should this be reported?


You are relying on unchecked illegal behavior, so this code isn’t valid and the bugs are your own.

Consider this modification to your program:

const std = @import("std");

pub fn main() !void {
    var a: ?u32 = @as(?u32, undefined);
    std.debug.print("{X}\n", .{std.mem.asBytes(&a)});
    const b = &a.?;
    b.* = 0x1234;
    std.debug.print("{X}\n", .{std.mem.asBytes(&a)});
    std.debug.print("{?x}\n", .{a});
}

This is the output with Debug:

AAAAAAAAAAAAAAAA
34120000AAAAAAAA
1234

and here it is with ReleaseFast:

0000000000000000
3412000000000000
null

So what’s going on here? The definition of “undefined” is that it can take on any value. In practice this means you should never read an undefined value; you should only write to it, before any reads take place. In safe modes, Zig will write hex ‘AA’ over any memory set to undefined, as a marker for the programmer doing the debugging: if you’re looking at that pattern, it’s /probably/ undefined memory. In release modes, this overwriting doesn’t happen.

So in your code, you’re setting the whole optional to undefined. In memory, the u32 payload comes first, and the flag recording whether the optional holds a value comes second. The actual illegal code is const b = &a.?;. In Debug, the .? checks that the optional is set: reading the flag, it happens to see a non-zero value (0xAA), so it concludes the optional is set. In ReleaseFast, it doesn’t safety-check the presence flag at all; here the flag happens to land on zeroed memory, but that isn’t guaranteed.
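As an aside, a safer way to get a pointer to the payload is to unwrap with an if-capture, which only yields the pointer when the optional actually holds a value (a minimal sketch, not from the original post):

```zig
const std = @import("std");

pub fn main() void {
    var a: ?u32 = null;

    // `if (a) |*v|` only enters the branch when the optional is set,
    // so we never form a pointer into an unset (or undefined) payload.
    if (a) |*v| {
        v.* = 0x1234;
    } else {
        std.debug.print("optional is null, nothing to write\n", .{});
    }

    a = 0;
    if (a) |*v| {
        v.* = 0x1234;
    }
    std.debug.print("{?x}\n", .{a});
}
```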

I suspect that what you intended to do was:

const std = @import("std");

pub fn main() !void {
    var a: ?u32 = @as(u32, undefined);
    std.debug.print("{X}\n", .{std.mem.asBytes(&a)});
    const b = &a.?;
    b.* = 0x1234;
    std.debug.print("{X}\n", .{std.mem.asBytes(&a)});
    std.debug.print("{?x}\n", .{a});
}

Which in Debug prints:

0000000001000000
3412000001000000
1234

and in ReleaseFast prints:

0000000001000000
3412000001000000
1234

No, because it works according to the documentation.
Using a value set to undefined is a bug. You can use undefined to defer value initialization or to not use a variable at all.

undefined means the value could be anything, even something that is nonsense according to the type. Translated into English, undefined means “Not a meaningful value. Using this value would be a bug. The value will be unused, or overwritten before being used.”
Documentation - The Zig Programming Language
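For reference, the legitimate use of undefined looks like this: declare storage without a meaningful value, then fully overwrite it before any read takes place (a minimal sketch of the deferred-initialization idiom):

```zig
const std = @import("std");

pub fn main() void {
    // Deferred initialization: the buffer's contents are meaningless here...
    var buf: [4]u8 = undefined;

    // ...and every byte is written before anything reads it,
    // so no undefined value is ever observed.
    for (&buf, 0..) |*b, i| {
        b.* = @intCast(i);
    }

    std.debug.print("{any}\n", .{buf});
}
```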


All that said, I /think/ this could be checked illegal behavior, instead of being unchecked. I’ve asked the compiler team for their opinion. I’m also surprised var a: ?u32 = @as(u32, undefined); doesn’t result in the memory being AAAAAAAA01000000 in Debug, which I’ve also asked about.

There’s an issue tracking “check 0 and 1, not 0 and not-0, in safe modes”, which came out of a similar conversation here on the forum.

If the behavior were changed to do the check more specifically, I wouldn’t want that, IMHO. undefined is undefined, code shouldn’t look at it, and a nullness check is looking.

I get the perspective that using @as(u32, undefined) is kind of like setting one field but not another, but I don’t think it’s a good approach to optionals because of how they participate in control flow. It’s like saying “well, technically this could be either case, who knows, but actually it’s definitely a u32, I just don’t know which one”. Whatever bug is involved in creating that value, setting the tag to “Some” won’t help fix it.

I have never had a situation where I needed an initial value for a ? which couldn’t be null. ?T says “maybe there’s something here, maybe there isn’t”, well, if there isn’t something on the RHS, then there isn’t a something there, so null it is.
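That pattern in code: initialize the optional to null, and collapse the “isn’t” case with orelse at the read site (a sketch of the idiom; the lookup function is a hypothetical stand-in):

```zig
const std = @import("std");

// Hypothetical producer: sometimes there's a value, sometimes there isn't.
fn lookup(found: bool) ?u32 {
    return if (found) 42 else null;
}

pub fn main() void {
    // "Maybe there's something here, maybe there isn't" -- start with null.
    var cached: ?u32 = null;

    // Populate only when there really is something on the RHS.
    cached = lookup(true);

    // `orelse` supplies a default for the null case when reading.
    const value = cached orelse 0;
    std.debug.print("{d}\n", .{value});
}
```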


Consistency across different optimization settings is a reasonable expectation if you ask me. The problem with something like this is that people will go out of their way to use it once it’s discovered.

Consistency in what is explicitly noted as illegal behaviour?


There is a section in the language reference for that. Though most things are pretty obvious like index out-of-bounds, integer overflow/underflow or division by zero.


Your code isn’t (just) changing optimization settings, it’s changing build modes. These conflate optimization and safety. It is not a reasonable expectation that flipping the switch on runtime safety will not cause differing behaviors. How would that even work?

Rather, what’s going on is that the safety mode is not correct for this case. It shouldn’t treat 0xaaaa as equivalent to 0x0001; it should crash, and ideally tell the user why.

I wouldn’t consider this a high priority if it were up to me (which it isn’t), because it’s in the category of “things you shouldn’t be doing anyway”, namely, setting an optional to undefined instead of null.

But if it bothers you enough that you want to fix it, I’m sure that would be welcome.


There might not be an urgent need to address this issue, but it’s very serious. I had kept the example above simple for demonstration purposes. Consider a more realistic scenario: a programmer is writing a function that records the occurrence of a critical event. Initially, he’s written something like this:

const std = @import("std");

const CriticalEvent = struct {
    parameter1: u32,
    parameter2: u32,
    parameter3: u32,
    parameter4: u32,
};
var last_critical_event: ?CriticalEvent = null;

pub fn logCriticalEvent() void {
    const p = &last_critical_event.?;
    p.parameter1 = 0xDEADC0DE;
    p.parameter2 = 0xBAADF00D;
    p.parameter3 = 0xFEEDFACE;
    p.parameter4 = 0xCAFED00D;
}

pub fn main() !void {
    logCriticalEvent();
    std.debug.print("{?}\n", .{last_critical_event});
}

That triggers an attempt-to-use-null panic. At this point, our programmer thinks “okay, I need to flip the optional state from null to set; let me try this…”. Assigning a typed undefined value seemingly fixed the problem. So he moves on to working on something else, oblivious to the fact that he’s introduced a hidden bug that only surfaces in the production build.

Silent loss of data is arguably the worst sort of bug a program can have. We can’t let programmers shoot themselves in the head like that.
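For completeness, the straightforward fix is to assign a complete value to the optional, which sets the presence flag and initializes the payload in one step (a sketch of how the example above could be corrected):

```zig
const std = @import("std");

const CriticalEvent = struct {
    parameter1: u32,
    parameter2: u32,
    parameter3: u32,
    parameter4: u32,
};
var last_critical_event: ?CriticalEvent = null;

pub fn logCriticalEvent() void {
    // Assigning a whole struct flips the optional from null to set
    // and initializes every field -- no undefined payload involved,
    // so the behavior is the same in every build mode.
    last_critical_event = .{
        .parameter1 = 0xDEADC0DE,
        .parameter2 = 0xBAADF00D,
        .parameter3 = 0xFEEDFACE,
        .parameter4 = 0xCAFED00D,
    };
}

pub fn main() void {
    logCriticalEvent();
    std.debug.print("{?}\n", .{last_critical_event});
}
```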


At this point, our programmer thinks “okay, I need to flip the optional state from null to set; let me try this…”. Assigning a typed unknown seemingly fixed the problem.

This seems like a contrived example; I don’t think it counts as a footgun when you’ve deliberately aimed down. By using ? the programmer is asserting that the value isn’t null.

  1. An undefined value should not be accessed
  2. You’re getting an attempt to use null panic, indicating that a value is being accessed
  3. Why would the programmer therefore jump to setting it to undefined as the first course of action?

undefined should not pass the null check in safe modes, that much I agree with. It’s going to impose some kind of penalty, no avoiding that: one way or the other, ? and its equivalents have to check three states, not two. But official Doctrine is “catch all illegal behavior in safe modes”, which this is, and runtime safety can be set on a per-block basis whenever it’s measurably useful to do that.

I think that’s the solution, rather than making it possible to set the “Some” tag in a way which is consistent between release modes but leaves the payload undefined. I continue to think that’s always a bug in the data model, so the best thing to do is surface that bug as soon as possible, which is when user code looks at a ?T which is undefined.

I read it as an illustration of a bug which status quo behavior aggravates, not a good thing which code should do.


You’re saying that build modes that do generate runtime checks and build modes that don’t generate runtime checks should both generate runtime checks. That’s obviously not reasonable.
