I just noticed this in the code for std.unicode.wtf16LeToWtf8:
pub fn wtf16LeToWtf8(wtf8: []u8, wtf16le: []const u16) usize {
    return utf16LeToUtf8Impl(wtf8, wtf16le, .can_encode_surrogate_half) catch |err| switch (err) {};
}
A switch statement must handle every error the error set contains. utf16LeToUtf8Impl returns different error sets depending on some comptime parameters, and this specific usage returns an empty error set. The switch is essentially verifying that the error set is empty. If an error is added, it would produce a compile error indicating that this piece of code needs to be fixed. It also informs the reader of the code that this is happening, and it's not just discarding a possible error.
The reason for possibly returning an error union with an empty error set is to keep usage consistent across call sites. If there were no cases where it could return an error, then it wouldn't return an error union.
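To make the idea concrete, here is a minimal sketch of a function whose error set depends on a comptime parameter. The names `ParseError`, `parseDigit` and `parseLenient` are made up for illustration and are not from std; the sketch assumes a reasonably recent Zig release.

```zig
const std = @import("std");

// Hypothetical helper: the error set is chosen at comptime.
fn ParseError(comptime strict: bool) type {
    return if (strict) error{InvalidDigit} else error{};
}

// Hypothetical function: always returns an error union, so every
// call site looks the same, but the set may be empty.
fn parseDigit(comptime strict: bool, c: u8) ParseError(strict)!u8 {
    if (c >= '0' and c <= '9') return c - '0';
    // `strict` is comptime-known, so this branch is only analyzed when it is true.
    if (strict) return error.InvalidDigit;
    return 0;
}

fn parseLenient(c: u8) u8 {
    // The error set is error{} here, so the switch has no prongs.
    return parseDigit(false, c) catch |err| switch (err) {};
}

test "lenient parse never fails" {
    try std.testing.expectEqual(@as(u8, 7), parseLenient('7'));
    try std.testing.expectEqual(@as(u8, 0), parseLenient('x'));
}
```

If `parseLenient` were changed to call `parseDigit(true, c)`, the empty switch would be expected to stop compiling because `error.InvalidDigit` is unhandled.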
This can also be done with just a try, which is terser, but would give a worse error message if an error were possible.
Would it? Seems it would feel somewhat "normal", though perhaps that's the point - perhaps an abnormal error is wanted, to help achieve that goal: "It also informs the reader of the code that this is happening".
It does seem like the empty switch is almost too clever by half. Is catch unreachable an option, if this is supposed to be impossible?
unreachable on its own is allowed to pass through to runtime, as an optimisation hint or a safety check. That means it compiles regardless of whether the path is reachable, so it would still compile if the call could return an error.
comptime unreachable would be better. I agree it communicates the intent better than an empty switch or a try in an errorless function does. And it's an even better error, since it would communicate concisely and precisely what the problem is.
An empty switch would seem arcane to an unknowing reader. And its error of "all cases must be handled [list of cases to handle]", while nice in its intended context, does not communicate intent very well.
A try, on the other hand, would give an error of "expected type usize found error{...}!usize", which, contrary to my earlier statement, actually communicates the problem and intent better. When I said it was a worse error I was thinking of those large error sets, sometimes with some @builtins included, that newer users struggle to decipher the first dozen times.
But when reading the source, try looks like it should not compile, so it communicates intent poorly.
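A small sketch of the difference between the two spellings, assuming a recent Zig release. `alwaysOk` is a hypothetical stand-in for a function whose error set happens to be empty.

```zig
const std = @import("std");

// Hypothetical stand-in: an error union whose error set is empty.
fn alwaysOk() error{}!u32 {
    return 92;
}

fn viaRuntimeUnreachable() u32 {
    // Compiles whether or not the error set is empty; if an error
    // were ever returned, it would only be caught at runtime in safe builds.
    return alwaysOk() catch unreachable;
}

fn viaComptimeUnreachable() u32 {
    // Intended to compile only while the error set stays empty:
    // if this branch were ever analyzed, it would be a compile error.
    return alwaysOk() catch comptime unreachable;
}

test "both spellings return the payload" {
    try std.testing.expectEqual(@as(u32, 92), viaRuntimeUnreachable());
    try std.testing.expectEqual(@as(u32, 92), viaComptimeUnreachable());
}
```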
Yeah, the code is confusing as it relies on the edge case of switch() {} returning an implicit noreturn. Given that the check here is essentially pedantic, it should be explicitly spelled out:
pub fn wtf16LeToWtf8(wtf8: []u8, wtf16le: []const u16) usize {
    return utf16LeToUtf8Impl(wtf8, wtf16le, .can_encode_surrogate_half) catch |err| switch (err) {
        inline else => |e| {
            @compileError("Expecting an empty error set, received: " ++ @errorName(e));
        },
    };
}
Or not done at all.
Should the construct in question really be allowed at all? The problem here is that the switch statement is switching on something that conceptually does not exist. It's switching on something that isn't even nothing. Conceptually, void represents nothing. A switch on a void would still produce an execution path. A choice between a single option is not really a choice at all, but it's still logically sound. Switching on an item in a null set, which cannot exist by definition, is just really, really weird.
I think the empty switch statement is fine, but it would be even better with a simple comment:
fun() catch |err| switch (err) {
    // the error set is empty so there is nothing to handle here
    // the compiler will let us know when this is no longer true
    // also note that all these lines have the exact same length
};
Regarding the catch comptime unreachable, I think it can also be much better with an actual log:
fun() catch @compileLog("It seems the error set is no longer empty. Handle those errors!");
This way, you get a compile time error which tells you that your assumption is no longer true. And it's still just one line of code (for those who care).
Edit: Just noticed that both of my ideas are pretty similar to what @chung-leong suggested.
I think so, yes.
It does exist: it's error{}, which is a weird thing, but very useful: an empty error set. It's useful because comptime means that sometimes functions need to return error unions, but those unions won't always include an error at all. The alternative there is to relax the rules around inferring return values, and I don't think we should.
Error sets are good, switching on error sets is good, and empty error sets are good: so switching on an empty error set is also good.
What else could it look like? else doesn't fit, it implies there's something to switch on which was skipped: but there isn't anything to switch on.
So:
catch |empty_err| switch(empty_err) {}
As you and others have pointed out, you can put something in there to make it less gnomic, if you'd like. Also, this won't compile if empty_err starts being an invalid name, so there's another advantage to it.
But it's a switch on a collection with no values, so it has no prongs. I think this is correct.
It expresses intent perfectly to the compiler. It expresses intent "just fine" to an experienced zigster. It expresses intent a bit less clearly to newbies. Even for one more experienced, I think it takes an extra second to grok. My favorite answer is that @compileError() would be best, with comptime unreachable less perfect only because it would still require a comment if you wanted to clarify. (Recently I replaced two of my own comptime unreachables with @compileError()s; doesn't always suit, but extra verbiage can be nice.)
My interest in this relates to my own history. A few months ago, high on the novelty of clarity and "no magic" in zig, I would have looked suspiciously at an empty switch… "it must work, but… I have to think about it." Now, meh. It doesn't bother me much. It took a second to see what was going on. The comment about how this function is essentially implementing an interface of sorts, and therefore needs the errorset, seemed useful, and made me think, "hmnn, I wonder if @compileError(message...) would communicate that in-place", but honestly, with time, stuff like this feels much less significant. Is my sense of "what magic is"… changing? Who knows. I honestly don't feel like this is "magic", today, even if it doesn't spell out the contextual detail for a reader. Should it have to? Would it be easy enough to just do so? It's a little pebble, though, aside from providing a nice instructive conversation.
You're confusing the type with an instance of that type. An instance of error{} is logically impossible. If you run the following:
pub fn main() void {
    const err: error{} = undefined;
    @compileLog(err);
}
The compiler will just segfault. What we have here is something akin to the square root of -1. It's an imaginary value. The switch statement is thus also imaginary. No switching actually occurs at comptime or runtime. Why retain such a weird idea in the language when there's no good usage scenario?
When you're writing code in anticipation of some possible future event (in this case the function suddenly returning a non-empty error set), you need to clearly communicate your intent. Otherwise you're just going to create confusion for the future programmer, who might be you in a few years' time or someone else. That was the state I found myself in looking at the code in question. I didn't know why the code would compile at all, since switch statements are typically of the type void when the cases themselves don't yield any values. The code would only make sense if switch(err) {} returns noreturn. That's totally an edge case, one which can't even be confirmed because the construct is imaginary.
I'm not confusing anything actually.
const std = @import("std");

const WeirdKind = enum {};
test "does it exist?" {
    std.debug.print("{s}\n", .{@typeName(WeirdKind)});
    const weird: WeirdKind = undefined;
    switch (weird) {} // weird!
    std.debug.print("and yet...\n", .{});
}
If I were to write linter rules for Zig, an empty error switch would trigger a warning.
`comptime unreachable` is clearer and not much longer. Code should be readable for beginners, preferably.
To offer a dissenting opinion, my general view is that domain-specific vocabulary is beneficial, and that it doesn't so much create a barrier to entry as make lack of knowledge legible. There's no way around spending effort to learn something, and, if that something has a specific name or shape, it becomes easier to anchor the acquired knowledge.
In this specific case, the idiom is that there are uninhabited types, types which do not admit any values (error{}, union(enum){}), that the text of a program can manipulate values of those types (which means that the code is unreachable), and that you can pattern match a value of uninhabited type to turn it into any other type, including noreturn.
That's a useful idiom, which looks puzzling at first sight, but then you learn it, and it becomes instantly recognizable.
The twist though, is that the above code doesn't quite do what it seems to do.
A hint is that Zig doesn't actually implement "pattern matching uninhabited type into noreturn", it only pattern matches it to void:
const T = union(enum) {};
comptime {
    @compileLog(@TypeOf(switch (@as(T, undefined)) {}));
    // @as(type, void)
}
(you can also check that the code after switch isn't considered dead).
But then, how can the following work?
const E = error{};
fn f() E!u32 {
    return 92;
}
fn g() u32 {
    return f() catch |err| switch (err) {};
}
You can't convert E!u32 into u32 if you unwrap E into void!
It gets worse, the following also works:
const E = error{};
fn f() E!u32 {
    return 92;
}
fn g() u32 {
    return f() catch |err| switch (err) {
        error.foo => {},
        error.bar => {},
    };
}
But the following finally stops compiling:
const E = error{foo};
fn f() E!u32 {
    return 92;
}
fn g() u32 {
    return f() catch |err| switch (err) {
        error.foo => {},
        error.bar => {},
    };
}
This last example, expectedly, complains that bar isn't a kind of error that is possible.
So, while switch(err) { } looks like it is discharging an uninhabited type, that's not actually what happens here. Laziness strikes again! catch |err| is the last thing that the compiler tries to analyze; at that point it already sees that err is uninhabited, and skips analyzing the "then clause" of catch entirely. So you can write whatever garbage there: as long as it passes AST check, it doesn't matter.
In other words, the most direct way to implement the original example would have been this:
pub fn wtf16LeToWtf8(wtf8: []u8, wtf16le: []const u16) usize {
    return utf16LeToUtf8Impl(wtf8, wtf16le, .can_encode_surrogate_half) catch
        comptime unreachable;
}
This is indeed very interesting. Zig generally blurs the distinction between uninhabited types and unit types, only retaining noreturn as an uninhabited type. E.g., in general, a pure "namespace" should be regarded as a nominal uninhabited type, but Zig implements namespaces using unit types and sometimes leverages this feature to provide flexibility.
However, when we need to use the characteristics of uninhabited types, we encounter some unexpected issues.
Well put. Since Zig has comptime, and therefore, types-as-values, there's a role for types which can't be instantiated (as well as ones which technically can be even though it's useless, like struct containers). Those types have "existence" because types exist: we can pass them around, introspect them, condition code on their structure, and so on.
It's a unique combination of design principles, and I don't say that lightly. So it's okay that there are consequences of it which are hard to guess in advance, and confusing to encounter in code. There's a learning curve, that's okay.
I'll note that at the time the issue I just linked to was written (#1350), enum{} was a compile error. This was correctly seen as an inconsistency: we don't need special cases for degenerate types, because they aren't dangerous, and may in some cases be useful.
For example, if there's a comptime choice between an enum and no-relevant-values, enum{} might be better than void as that type. If misused, the compile error will identify that the enum doesn't have a field of that literal, rather than say something more confusing about casting an enum literal to void or whatever.
But we don't need to make up a use for all the degenerate / trivial types to justify them: there's simply no reason to forbid them to begin with.
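As a rough sketch of that comptime choice (the `Mode` function and its field names are hypothetical, and this assumes a recent Zig release where enum {} is accepted):

```zig
const std = @import("std");

// Hypothetical: pick between a real set of modes and "no modes at all"
// while still handing back an enum type in both cases.
fn Mode(comptime has_modes: bool) type {
    return if (has_modes) enum { fast, small } else enum {};
}

test "the degenerate enum is still an ordinary type" {
    // Both results can be introspected the same way.
    try std.testing.expectEqual(@as(usize, 2), std.meta.fields(Mode(true)).len);
    try std.testing.expectEqual(@as(usize, 0), std.meta.fields(Mode(false)).len);
}
```

Code that is generic over `Mode(...)` can keep treating the result as an enum, and a stray `.fast` literal against the empty variant should fail with an enum-related message.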