What is the purpose of an empty switch statement?

I just noticed this in the code for std.unicode.wtf16LeToWtf8:

pub fn wtf16LeToWtf8(wtf8: []u8, wtf16le: []const u16) usize {
    return utf16LeToUtf8Impl(wtf8, wtf16le, .can_encode_surrogate_half) catch |err| switch (err) {};
}

A switch statement must handle every error the error set contains. utf16LeToUtf8Impl returns different error sets depending on its comptime parameters, and this specific usage returns an empty error set. The switch is essentially verifying that the error set is empty. If an error were added, it would produce a compile error indicating that this piece of code needs to be fixed. It also informs the reader of the code that this is happening, and that it’s not just discarding a possible error.

The reason for returning an error union even when its error set may be empty is to keep usage consistent across call sites. If there were no cases where it could return an error, it wouldn’t return an error union at all.
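
To make that concrete, here is a minimal sketch of the pattern. The names Mode, ErrorSetFor, parseDigit and parseDigitLenient are made up for illustration and are not from std; only the overall shape mirrors what the std function does.

```zig
const std = @import("std");

// Hypothetical comptime switch that selects the behaviour (not part of std).
const Mode = enum { strict, lenient };

// The error set is picked at comptime: empty for .lenient.
fn ErrorSetFor(comptime mode: Mode) type {
    return switch (mode) {
        .strict => error{Invalid},
        .lenient => error{},
    };
}

// Hypothetical shared implementation; every caller sees an error union,
// even when the error set of that instantiation is empty.
fn parseDigit(byte: u8, comptime mode: Mode) ErrorSetFor(mode)!u8 {
    if (byte >= '0' and byte <= '9') return byte - '0';
    switch (mode) {
        .strict => return error.Invalid,
        .lenient => return 0,
    }
}

// The wrapper strips the error union. The empty switch only compiles
// while the error set for .lenient stays empty.
fn parseDigitLenient(byte: u8) u8 {
    return parseDigit(byte, .lenient) catch |err| switch (err) {};
}

test "lenient wrapper has no error to handle" {
    try std.testing.expectEqual(@as(u8, 7), parseDigitLenient('7'));
}
```

If ErrorSetFor(.lenient) ever gained a member, the empty switch in parseDigitLenient would stop compiling, which is the check described above.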

This can also be done with just a try, which is terser but would give a worse error message if an error were possible :(

Would it? Seems it would feel somewhat “normal”, though perhaps that’s the point: perhaps an abnormal error is wanted, to help achieve that goal: “It also informs the reader of the code that this is happening”.

It does seem like the empty switch is almost too clever by half. Is catch unreachable an option, if this is supposed to be impossible?

unreachable on its own is allowed to survive to runtime, as a safety check in safe build modes or as an optimisation hint in unsafe ones. So it compiles regardless of whether the path is reachable, which means it still compiles if the function can return an error.

comptime unreachable would be better. I agree it communicates the intent better than an empty switch or a try in an errorless function does. And it’s even a better error, since the error would communicate concisely and precisely what the problem is.

An empty switch would seem arcane to an unknowing reader. And its error of “all cases must be handled [list of cases to handle]”, while nice in its intended context, does not communicate intent very well.

A try on the other hand, would give an error of “expected type usize found error{...}!usize”. Which, contrary to my earlier statement, actually communicates the problem and intent better. When I said it was a worse error I was thinking of those large error sets, sometimes with some @builtins included, that newer users struggle to decipher the first dozen times.
But when reading the source, a try in a function that returns no error looks like it should not compile, so it communicates intent poorly.
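
For comparison, a small sketch of three of the spellings side by side. alwaysOk and the via* wrappers are hypothetical names used only for illustration.

```zig
const std = @import("std");

// Hypothetical helper whose error set is empty.
fn alwaysOk() error{}!u32 {
    return 92;
}

// Plain unreachable: this would keep compiling even if alwaysOk gained
// errors, and would then be a runtime safety check or optimisation hint.
fn viaUnreachable() u32 {
    return alwaysOk() catch unreachable;
}

// comptime unreachable: compilation fails if this branch is ever analyzed,
// i.e. as soon as the error set stops being empty.
fn viaComptimeUnreachable() u32 {
    return alwaysOk() catch comptime unreachable;
}

// Empty switch: fails with "all cases must be handled" style errors
// once the error set gains a member.
fn viaEmptySwitch() u32 {
    return alwaysOk() catch |err| switch (err) {};
}

test "all spellings unwrap the value" {
    try std.testing.expectEqual(@as(u32, 92), viaUnreachable());
}
```

All three compile today; they differ only in what happens once the error set of alwaysOk is no longer empty.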

Yeah, the code is confusing as it relies on the edge case of switch() {} returning an implicit noreturn. Given that the check here is essentially pedantic, it should be explicitly spelled out:

pub fn wtf16LeToWtf8(wtf8: []u8, wtf16le: []const u16) usize {
    return utf16LeToUtf8Impl(wtf8, wtf16le, .can_encode_surrogate_half) catch |err| switch (err) {
        inline else => |e| {
            @compileError("Expecting an empty error set, received: " ++ @errorName(e));
        },
    };
}

Or not done at all.

Should the construct in question really be allowed at all? The problem here is that the switch statement is switching on something that conceptually does not exist. It’s switching on something that isn’t even nothing. Conceptually, void represents nothing. A switch on a void would still produce an execution path. A choice between a single option is not really a choice at all, but it’s still logically sound. Switching on an item in a null set, which cannot exist by definition, is just really, really weird.

I think the empty switch statement is fine, but it would be even better with a simple comment:

fun() catch |err| switch (err) {
    // the error set is empty so there is nothing to handle here
    // the compiler will let us know when this is no longer true
    // also note that all these lines have the exact same length
};

Regarding the catch comptime unreachable, I think it can also be much better with an actual log:

fun() catch @compileLog("It seems the error set is no longer empty. Handle those errors!");

This way, you get a compile time error which tells you that your assumption is no longer true. And it’s still just one line of code (for those who care).

Edit: Just noticed that both of my ideas are pretty similar to what @chung-leong suggested.

I think so, yes.

It does exist: it’s error{}, which is a weird thing, but very useful: an empty error set. It’s useful because comptime means that sometimes functions need to return error unions, but those unions won’t always include an error at all. The alternative there is to relax the rules around inferring return values, and I don’t think we should.

Error sets are good, switching on error sets is good, and empty error sets are good: so switching on an empty error set is also good.

What else could it look like? else doesn’t fit, it implies there’s something to switch on which was skipped: but there isn’t anything to switch on.

So:

catch |empty_err| switch(empty_err) {}

As you and others have pointed out, you can put something in there to make it less gnomic, if you’d like. Also, this won’t compile if empty_err starts being an invalid name, so there’s another advantage to it.

But it’s a switch on a collection with no values, so it has no prongs. I think this is correct.

It expresses intent perfectly to the compiler. It expresses intent “just fine” to an experienced zigster. It expresses intent a bit less clearly to newbies. Even for someone more experienced, I think it takes an extra second to grok. My favorite answer is that @compileError() would be best, with comptime unreachable less perfect only because it would still require a comment if you wanted to clarify. (Recently I replaced two of my own comptime unreachables with @compileError()s; doesn’t always suit, but extra verbiage can be nice.)

My interest in this relates to my own history. A few months ago, high on the novelty of clarity and “no magic” in zig, I would have looked suspiciously at an empty switch… “it must work, but… I have to think about it.” Now, meh. It doesn’t bother me much. It took a second to see what was going on. The comment about how this function is essentially implementing an interface of sorts, and therefore needs the errorset seemed useful, and made me think, “hmnn, I wonder if @compileError(message...) would communicate that in-place”, but honestly, with time, stuff like this feels much less significant. Is my sense of “what magic is”… changing? Who knows. I honestly don’t feel like this is “magic”, today, even if it doesn’t spell out the contextual detail for a reader. Should it have to? Would it be easy enough to just do so? It’s a little pebble, though, aside from providing a nice instructive conversation.

You’re confusing the type with an instance of that type. An instance of error{} is logically impossible. If you run the following:

pub fn main() void {
    const err: error{} = undefined;
    @compileLog(err);
}

The compiler will just segfault. What we have here is something akin to the square root of -1. It’s an imaginary value. The switch statement is thus also imaginary. No switching actually occurs at comptime or runtime. Why retain such a weird idea in the language when there’s no good usage scenario?

When you’re writing code in anticipation of some possible future event (in this case the function suddenly returning a non-empty error set), you need to clearly communicate your intent. Otherwise you’re just going to create confusion for the future programmer, who might be you in a few years’ time or someone else. That was the state I found myself in looking at the code in question. I didn’t know why the code would compile at all, since switch statements are typically of the type void when the cases themselves don’t yield any values. The code would only make sense if switch(err) {} returns noreturn. That’s totally an edge case, one which can’t even be confirmed because the construct is imaginary.

I’m not confusing anything actually.

const std = @import("std");

const WeirdKind = enum {};

test "does it exist?" {
    std.debug.print("{s}\n", .{@typeName(WeirdKind)});
    const weird: WeirdKind = undefined;
    switch (weird) {} // weird!
    std.debug.print("and yet...\n", .{});
}

See also: #1530.

If I were to write linter rules for Zig, an empty error switch would trigger a warning.

`comptime unreachable` is clearer and not much longer. Code should be readable for beginners, preferably.

To offer a dissenting opinion, my general view is that domain-specific vocabulary is beneficial, and that it doesn’t so much create a barrier to entry as make lack of knowledge legible. There’s no way around spending effort to learn something, and, if that something has a specific name or shape, it becomes easier to anchor the acquired knowledge.

In this specific case, the idiom is that there are uninhabited types, types which do not admit any values (error{}, union(enum){}), that the text of a program can manipulate values of those types (which means that the code is unreachable), and that you can pattern match a value of an uninhabited type to turn it into any other type, including noreturn.

That’s a useful idiom, which looks puzzling at first sight, but then you learn it, and it becomes instantly recognizable.

The twist though, is that the above code doesn’t quite do what it seems to do.

A hint is that Zig doesn’t actually implement “pattern matching uninhabited type into noreturn”, it only pattern matches it to void:

const T = union(enum) {};
comptime {
    @compileLog(@TypeOf(switch (@as(T, undefined)) {}));
    // @as(type, void)
}

(you can also check that the code after switch isn’t considered dead).

But then, how can the following work?

const E = error{};

fn f() E!u32 {
    return 92;
}

fn g() u32 {
    return f() catch |err| switch (err) {};
}

You can’t convert E!u32 into u32 if you unwrap E into void!

It gets worse, the following also works:

const E = error{};

fn f() E!u32 {
    return 92;
}

fn g() u32 {
    return f() catch |err| switch (err) {
        error.foo => {},
        error.bar => {},
    };
}

But the following finally stops compiling:

const E = error{foo};

fn f() E!u32 {
    return 92;
}

fn g() u32 {
    return f() catch |err| switch (err) {
        error.foo => {},
        error.bar => {},
    };
}

This last example, expectedly, complains that bar isn’t a kind of error that is possible.

So, while switch(err) { } looks like it is discharging an uninhabited type, that’s not actually what happens here. Laziness strikes again! catch |err| is the last thing that the compiler tries to analyze: at that point it already sees that err is uninhabited, and skips analyzing the “then clause” of catch entirely. So you can write whatever garbage there; as long as it passes AST check, it doesn’t matter.

In other words, the most direct way to implement the original example would have been this:

pub fn wtf16LeToWtf8(wtf8: []u8, wtf16le: []const u16) usize {
    return utf16LeToUtf8Impl(wtf8, wtf16le, .can_encode_surrogate_half) catch
        comptime unreachable;
}

This is indeed very interesting. Zig generally blurs the distinction between uninhabited types and unit types, only retaining noreturn as an uninhabited type. E.g., in general, a pure ‘namespace’ should be regarded as a nominal uninhabited type, but Zig implements namespaces using unit types and sometimes leverages this feature to provide flexibility.

However, when we need to use the characteristics of uninhabited types, we encounter some unexpected issues.

Well put. Since Zig has comptime, and therefore, types-as-values, there’s a role for types which can’t be instantiated (as well as ones which technically can be even though it’s useless, like struct containers). Those types have “existence” because types exist: we can pass them around, introspect them, condition code on their structure, and so on.

It’s a unique combination of design principles, and I don’t say that lightly. So it’s okay that there are consequences of it which are hard to guess in advance, and confusing to encounter in code. There’s a learning curve, that’s okay.

I’ll note that at the time the issue I just linked to was written (#1350), enum{} was a compile error. This was correctly seen as an inconsistency: we don’t need special cases for degenerate types, because they aren’t dangerous, and may in some cases be useful.

For example, if there’s a comptime choice between an enum and no-relevant-values, enum{} might be better than void as that type. If misused, the compile error will identify that the enum doesn’t have a field of that literal, rather than say something more confusing about casting an enum literal to void or whatever.

But we don’t need to make up a use for all the degenerate / trivial types to justify them: there’s simply no reason to forbid them to begin with.
