Unreachable

tensorush · March 22, 2024, 9:23am

unreachable is an assertion that the programmer makes to ensure program correctness and enable compiler optimizations. It is but one among a plethora of other assertions, like casting, indexing and aligning operations, all implemented as safety-checked illegal behavior.

For simplification, unreachable’s semantics could be decomposed into several different behaviors depending on evaluation time as well as optimization mode:

Hitting unreachable at runtime emits a panic when compiled in Debug and ReleaseSafe modes, acting exactly the same as @panic("reached unreachable code"), but causes unchecked illegal behavior in ReleaseFast or ReleaseSmall modes.
Hitting unreachable at compile-time, e.g. in comptime unreachable, causes a compile-time error, acting exactly the same as @compileError("reached unreachable code").

However, the crucial semantic difference between unreachable and @panic is that unreachable means that you guarantee a given assertion will never fail, while @panic means that you accept that a given assertion could fail, in which case the program, having detected that it has entered an incorrect, unrecoverable state, will crash.

As a result, since using unreachable at runtime anywhere in a codebase that’s intended to be compiled in unsafe modes may result in unchecked illegal behavior, you should consider using @panic wherever possible.
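The distinction can be sketched as follows. This is only an illustration: `parsePort`, `portOrPanic` and `lastDigit` are hypothetical helpers, not part of any library. The first case can genuinely fail because the input comes from outside the program, so it uses @panic; the second is guarded by an invariant established on the line above, so unreachable is a promise that can actually be kept.

```zig
const std = @import("std");

// Hypothetical helper for illustration: parsing external input can fail.
fn parsePort(text: []const u8) !u16 {
    return std.fmt.parseInt(u16, text, 10);
}

// Failure is possible here, so crash deliberately. @panic behaves the
// same way in every build mode.
fn portOrPanic(text: []const u8) u16 {
    return parsePort(text) catch @panic("invalid port");
}

// The modulo guarantees `d` is in 0..9, so the branch below can never
// be taken. Stating that with `unreachable` is a promise, not a check.
fn lastDigit(n: u32) u8 {
    const d = n % 10;
    if (d > 9) unreachable;
    return @intCast(d);
}

pub fn main() void {
    std.debug.print("{d} {d}\n", .{ portOrPanic("8080"), lastDigit(1234) });
}
```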

Optional unwrap

Don’t forget that a.? is just a shorthand for:

a orelse unreachable
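A minimal sketch of the difference between asserting and handling a null, assuming a hypothetical helper `firstByte` that returns null for an empty slice. The `.?` form is only appropriate when the caller can guarantee the value is present; otherwise `orelse` with a real fallback handles the null explicitly.

```zig
const std = @import("std");

// Hypothetical helper: returns null for an empty slice.
fn firstByte(bytes: []const u8) ?u8 {
    return if (bytes.len > 0) bytes[0] else null;
}

// Asserts non-null: the caller guarantees `bytes` is not empty.
// Equivalent to `firstByte(bytes) orelse unreachable`.
fn firstByteAsserted(bytes: []const u8) u8 {
    return firstByte(bytes).?;
}

// Handles null explicitly with a fallback instead of asserting.
fn firstByteOr(bytes: []const u8, fallback: u8) u8 {
    return firstByte(bytes) orelse fallback;
}

pub fn main() void {
    std.debug.print("{c} {c}\n", .{ firstByteAsserted("zig"), firstByteOr("", '?') });
}
```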

Debug assertion

Don’t forget that std.debug.assert is implemented as:

pub fn assert(ok: bool) void {
    if (!ok) unreachable;
}

Error discarding misuse

Should not be used to ignore errors because you would be making a guarantee that a function call will never return an error, which would contradict the function’s design of being allowed to return an error in the first place.

export fn func() void {
    mayFail() catch unreachable;
}
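One possible alternative, sketched under the assumption that the function really can fail and the signature leaves no way to report it: crash explicitly with a message. Here `mayFail` is a hypothetical stand-in that takes a flag so the example is self-contained.

```zig
const std = @import("std");

// Hypothetical fallible function standing in for `mayFail`.
fn mayFail(ok: bool) error{Failed}!void {
    if (!ok) return error.Failed;
}

// The signature cannot return an error, so an unexpected failure
// crashes with a message in every build mode instead of being
// asserted away with `catch unreachable`.
export fn func() void {
    mayFail(true) catch |err| std.debug.panic("mayFail failed: {s}", .{@errorName(err)});
}
```

Unlike `catch unreachable`, this keeps the failure visible in ReleaseFast and ReleaseSmall builds as well.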

Impossible `switch` case handling

Can be used to guarantee that certain, or remaining, switch cases will never happen.

switch (my_union) {
    .a => |a| { ... },
    else => unreachable,
}
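A small self-contained sketch of a case where the remaining branch is provably impossible. `trafficLight` is a hypothetical function: the switch operand is `tick % 3`, which is always 0, 1 or 2, yet the compiler still requires an `else` prong because the operand's type is `u32`.

```zig
const std = @import("std");

// Hypothetical example: the modulo restricts the operand to 0, 1 or 2.
fn trafficLight(tick: u32) []const u8 {
    return switch (tick % 3) {
        0 => "red",
        1 => "green",
        2 => "yellow",
        // Required by the compiler, but `tick % 3` can never reach it.
        else => unreachable,
    };
}

pub fn main() void {
    std.debug.print("{s}\n", .{trafficLight(4)});
}
```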

Static error absence guarantee

Can be used with errdefer as a compile-time check that enforces the absence of errors in the remaining lines of the current block.

// Errors are allowed here
errdefer comptime unreachable;
// Errors are forbidden here
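A fuller sketch of where this pattern tends to be useful, assuming a hypothetical `Pair` type and `initPair` function: once every fallible step has succeeded, the rest of the initialization must not fail, because no cleanup has been registered for the second allocation.

```zig
const std = @import("std");

// Hypothetical type used only for this illustration.
const Pair = struct { a: []u8, b: []u8 };

fn initPair(allocator: std.mem.Allocator) !Pair {
    const a = try allocator.alloc(u8, 4);
    // If the second allocation fails, release the first one.
    errdefer allocator.free(a);
    const b = try allocator.alloc(u8, 4);

    // From here on, no error may be returned: adding a `try` below
    // this line would turn into a compile error.
    errdefer comptime unreachable;

    @memset(a, 0);
    @memset(b, 0);
    return Pair{ .a = a, .b = b };
}
```

The caller owns both slices and is expected to free them with the same allocator.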

Explicit control flow barrier

Can be used to satisfy the compiler requirement of guaranteeing that the control flow will never reach the end of the current block.

fn withFor(any: AnySlice) usize {
    const Tag = @typeInfo(AnySlice).Union.tag_type.?;
    inline for (@typeInfo(Tag).Enum.fields) |field| {
        if (field.value == @intFromEnum(any)) {
            return @field(any, field.name).len;
        }
    }

    // When using `inline for` the compiler doesn't know that every
    // possible case has been handled requiring an explicit `unreachable`.
    unreachable;
}

AndrewCodeDev · March 22, 2024, 9:42am

Overall, I think this is good - there’s one thing we should add to this example:

tensorush:

pub fn func(min: u8, max: u8) u8 {
    // same as `if (min > max) unreachable;`
    std.debug.assert(min <= max);

    var byte: u8 = 0;
    while (byte <= max) : (byte += 1) {
        if (byte >= min) {
            return byte;
        }
    }

    // explicit `unreachable` is required to avoid the error:
    // "function with non-void return type 'u8' implicitly returns"
    unreachable;
}

Since assert is just an unreachable as well, this function can end in undefined behavior depending on the optimization settings.

pub fn assert(ok: bool) void {
    if (!ok) unreachable; // assertion failure
}

I think it’s worth mentioning that - I can edit that in.

tensorush · March 22, 2024, 9:44am

My thoughts exactly, so I’ve already added it

AndrewCodeDev · March 22, 2024, 9:48am

Right, I can see that from the block about assert above, but for beginners, I think it’s worth spelling it out a bit more in that example. Just a note like:

“In debug, these checks will validate the behavior of this function, but recall that optimization impacts unreachable statements and can cause undefined behavior. In general, do not depend on this pattern.” kind of sentiment.

tensorush · March 22, 2024, 9:48am

Btw, I’ve discovered that langref does mention that unreachable in ReleaseFast and ReleaseSmall results in UB, but it’s deep in the try section:

https://ziglang.org/documentation/master/#try

tauoverpi · March 22, 2024, 9:48am

Can be used to indicate that certain, or remaining, switch cases cannot be handled.

This is the wrong use of unreachable: it doesn’t say that the branches cannot be handled, but rather that the path will never be taken and that it’s safe to completely exclude the path from the generated code. Thus the code introduces undefined behaviour if used as described, since unreachable in a switch should be interpreted as “I promise this can never happen and I’ve proven it externally”.

Can be used to discard errors, e.g. in functions that cannot return errors by design, like ensuring C ABI compatibility.

It should only be used in cases where mayFail() is guaranteed not to return an error; otherwise the result of running func is undefined behaviour. @panic() is likely more suitable if mayFail() can fail and there is no way to handle the error API-wise.

tensorush · March 22, 2024, 9:53am

Yes, definitely. Maybe my wording is off, but I meant exactly what you’ve clarified.

Also, I’ve tried to tackle the debates of when unreachable should be used by prefacing the doc with the description of what exactly unreachable stands for in each context, letting everyone decide for themselves whether they should adopt a certain pattern or not.

AndrewCodeDev · March 22, 2024, 9:55am

If you think the explanation is clear enough, then I’m good with it. I try to read things with a beginner’s eye, but the fact that we’re talking about it here is probably good enough coverage.

tensorush · March 22, 2024, 10:08am

This caveat belongs in the “Reaching Unreachable Code” section:

https://ziglang.org/documentation/master/#Reaching-Unreachable-Code

UPD: My bad, it does mention it, in the paragraph prefacing the “Undefined Behavior” section. Time to reread the langref, lots of new stuff.

tauoverpi · March 22, 2024, 10:19am

I still find that the “Error discarding” section should be removed, as it’s incorrect, and replaced with a section explaining that catch unreachable is used to communicate to the compiler that mayFail() cannot fail (as in, it’s impossible) in the given situation, which you’ve proven externally (or you accept that it could start playing tetris if you made a mistake). Mentioning C ABI compatibility also directs beginners down the wrong path, as they should decide how to handle the error, and @panic() if there’s no better option and mayFail() can fail in that call. Failing fast is often the better choice, and situations where you must continue even with the wrong state are rare.

Same for the switch as it’s better not to include wrong advice in the initial post given that beginners will likely not read all of the discussion.

tensorush · March 22, 2024, 10:41am

Using unreachable at runtime anywhere in a codebase that’s intended to be compiled in ReleaseFast or ReleaseSmall may result in UB, in which case you’re better off resorting to unreachable’s runtime counterpart @panic. I thought the doc introduction made it clear enough.

tauoverpi · March 22, 2024, 10:53am

“Error discarding” is wrong use by definition. ReleaseSafe doesn’t change what unreachable communicates, just what happens upon violation of the assertion; thus it’s still undefined behaviour, just with the compiler consistently calling @panic(). unreachable use should not be inconsistent between release modes, as this creates brittle code and divides zig into two languages, where one uses unreachable as @panic() and the other doesn’t.

Thus the “Error discarding” section should be removed, as it shows wrong usage of unreachable and presents it as if it were the correct way to not handle an error. There is no way to discard an error in zig, just different ways to handle them, where the user is expected to decide which is the most suitable.

AndrewCodeDev · March 22, 2024, 11:22am

I think we need to split the difference here.

It’s entirely possible that while reading someone’s code, you’ll come across this. You may not agree with it (I don’t recommend this pattern either), but that doesn’t mean you won’t encounter it at some point. It may be completely wrong and communicate the wrong thing, but it’s not invalid to document that these things exist and do compile.

At the same time, we can also document why things can go wrong and provide other alternatives. You’re right in saying that there are other ways to handle this and we can certainly add that information, specifically about @panic and that if you stumble into that branch, you’ve wandered into undefined behaviour. We can also add the bit that this pattern assumes that the author has already ruled out (to a certainty) that this branch will ever be taken.

I don’t think it should be removed, but we can adjust the explanation surrounding what it technically means and add some alternatives.

tauoverpi · March 22, 2024, 11:39am

I did mention initially that it should be replaced with a section explaining that it’s an easy mistake to make, as it’s not correct by any definition as the language is defined today. It’s the suggestion that it’s a valid way to handle this case that’s the most problematic. Just because something does compile doesn’t mean it’s correct; thus, if this is to remain, it should be moved to a “mistakes” section and explicitly documented as such.

If it’s not replaced with a mistakes section then the second best is still removal of wrong advice. I don’t see why it’s valuable to keep wrong use of the language in what is effectively a “suggested use” section.

note:

https://ziglang.org/documentation/master/#unreachable
https://ziglang.org/documentation/master/#Undefined-Behavior
https://blog.regehr.org/archives/213

kristoff · March 22, 2024, 12:08pm

unreachable is an assertion that the programmer makes to give more information to the compiler.

in unsafe release modes, unreachable (and assertions more in general) become logical propositions that the compiler can leverage to perform better optimizations. when those assertions are wrong, then you get undefined behavior because you gave a “false fact” to the compiler and the resulting behavior breaks the model generally used to describe computation.

in debug / safe modes the assertions are instead tested at runtime to help catch bugs, making unreachable behave similarly to a panic, although those are two very different things, almost opposite to one another.

unreachable means that you are confident a given condition will never manifest, while a panic means that you expect that a given condition could manifest, but that the program has no better way of dealing with it other than crashing.

unreachable when evaluated at comptime behaves like a panic always simply because comptime is always run in a safe evaluation context.

dude_the_builder · March 22, 2024, 12:22pm

IMO this is clearly addressed in the first bullet point of the intro.

AndrewCodeDev · March 22, 2024, 12:25pm

Yeah, and I tend to agree now that I’ve looked over it a few more times. I tend to err on the side of caution with explanations and repeat things, but I agree that’s been covered.

kristoff · March 22, 2024, 12:50pm

I think neither the first bullet point nor the ones after it mention any of the performance upsides, which is why unreachable is a thing in the first place. Without that consideration, unreachable comes off as a footgunny panic alternative.

I also think that the “exactly the same as panic/compileError” without any further explanation is potentially misleading, because it reinforces the idea that the tools are similar, while in my opinion it is much more useful to highlight the differences.

dude_the_builder · March 22, 2024, 12:55pm

First of all, great contribution @tensorush; I think we can all agree that much more learning material regarding unreachable is sorely needed.

Although @tauoverpi's observations on the correct use of unreachable are good advice in general, they give me the impression of a highly restricted use of this part of the language, pretty much removing its usefulness in the develop → test → debug phase.

I find throwing in unreachable during fast prototyping an invaluable tool. It lets me concentrate on the main logic to get things working and leave the error-handling analysis for later. Then, before building for production, all I need to do is a quick find or grep for unreachable and then decide how to handle the errors.

Also, even for production, it seems that unreachable in a switch isn’t categorically wrong:

❯ cd zig
❯ fgrep -R 'else => unreachable' lib/std/* | wc -l
     159

Also

You could say that

mayFail() catch return;

is still handling the error, but to me, this is just discarding it. Even the fact that you have catch without a capture as valid language syntax hints in the direction that sometimes discarding the error or ignoring it is the right thing to do.

All in all, I think it’s just a matter of not speaking in absolutes; being flexible and tolerant, just not to the point of sloppiness.

kristoff · March 22, 2024, 1:10pm

It definitely isn’t wrong at all, as long as it marks an impossible code path.

https://github.com/kristoff-it/ziggy/blob/main/src/ziggy/Parser.zig#L301-L308

pub fn parseBool(self: *Parser, true_or_false: Token) !bool {
    try self.mustAny(true_or_false, &.{ .true, .false });
    return switch (true_or_false.tag) {
        .true => true,
        .false => false,
        else => unreachable,
    };
}

In this example the Token union has a lot of cases in it, but because I’m using mustAny, which returns an error if the given token is anything other than .true or .false, then I know that the subsequent switch will only have to handle those two cases and any other is impossible.

Using unreachable as a placeholder is something that I’ve done myself in the past but it’s a bit of a dangerous thing to do, because leaving a wrong unreachable in the code is much worse than leaving a wrong @panic as you might not get any immediately visible misbehavior when you hit it in release mode.