Backstory: I needed a partial enum match for the PR "cli: refactor main.zig" (tigerbeetle/tigerbeetle #3152), got annoyed enough to start writing a bug report about the compiler complaining about non-exhaustive switches over comptime-known values, and then it dawned on me…
Not a meaningful difference, but note that you can also use @compileError for this:
switch (u) {
    inline .a, .b, .c => |_, ab| {
        handle_ab();
        switch (ab) {
            .a => handle_a(),
            .b => handle_b(),
            else => @compileError("must be a or b"),
        }
    },
}
switch.zig:14:25: error: must be a or b
                else => @compileError("must be a or b"),
                        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
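For comparison, here is a minimal sketch of a partial match that uses comptime unreachable in the inner switch instead of @compileError. The union U, its payload types, and the function signedPayload are hypothetical names picked for illustration; they are not from the PR or the original post.

```zig
const std = @import("std");

// Hypothetical union: .a and .b share handling, .c is handled separately.
const U = union(enum) { a: i32, b: i32, c };

fn signedPayload(u: U) i32 {
    return switch (u) {
        // One body is instantiated per listed tag; `tag` is comptime-known.
        inline .a, .b => |payload, tag| blk: {
            const sign: i32 = switch (tag) {
                .a => 1,
                .b => -1,
                // Only the prong matching the comptime-known tag is analyzed,
                // so this fails the build only if another tag is added
                // to the inline list above.
                else => comptime unreachable,
            };
            break :blk sign * payload;
        },
        .c => 0,
    };
}

test "partial match over .a and .b" {
    try std.testing.expectEqual(@as(i32, 5), signedPayload(.{ .a = 5 }));
    try std.testing.expectEqual(@as(i32, -5), signedPayload(.{ .b = 5 }));
    try std.testing.expectEqual(@as(i32, 0), signedPayload(.c));
}
```

Adding .c to the inline list would turn the else prong into a compile error, which is the same safety net the @compileError variant provides, just without a custom message.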
this is really cool! I had no idea you could use inline on anything other than else. bytecode VMs are top of mind right now, which is a place where this would be really valuable for me:
dispatch: switch (instruction) {
    inline .load0, .load1, .load2, .load3, .load => |_, tag| {
        const slot = switch (tag) {
            .load => self.read(u8),
            // @intFromEnum needs a typed enum value, not a bare enum literal
            else => @intFromEnum(tag) - @intFromEnum(@as(@TypeOf(tag), .load0)),
        };
        self.push(self.locals[slot]);
        continue :dispatch self.read(Instruction);
    },
    // ...
}
This reminds me of a thread from the end of last year: Sub switch pattern - #4 by joed
I threw together a type-safe way of doing something very similar to this, without having to spell out every case for the inline case, and without having to explicitly add a comptime unreachable, allowing you to write code like this:
const std = @import("std");

const Operator = enum {
    pub const Arity = enum { nullary, unary, binary };

    halt,
    neg,
    inc,
    add,
    sub,

    pub fn arity(self: Operator) Arity {
        return switch (self) {
            .halt => .nullary,
            .neg, .inc => .unary,
            .add, .sub => .binary,
        };
    }
};

// groupBy is the helper from the linked thread; its definition is not shown here.
pub fn main() !void {
    var op: Operator = .neg;
    _ = .{&op};
    switch (groupBy(op, Operator.arity)) {
        .nullary => |o| {
            switch (o) {
                .halt => std.debug.print("halt\n", .{}),
            }
        },
        .unary => |o| {
            switch (o) {
                .neg => std.debug.print("negation\n", .{}),
                .inc => std.debug.print("increment\n", .{}),
            }
        },
        .binary => |o| {
            switch (o) {
                .add => std.debug.print("addition\n", .{}),
                .sub => std.debug.print("subtraction\n", .{}),
            }
        },
    }
}
I have a modified and hacky version at joedavis/metax on GitHub (miscellaneous metaprogramming facilities for Zig) that works with tagged unions as well as enums, but I haven’t tested it in a few months on recent Zig versions.
I was going to argue for this instead:
dispatch: switch (instruction) {
    // capture the matched value; `instruction` itself is only the initial operand
    .load0, .load1, .load2, .load3 => |tag| {
        const slot = @intFromEnum(tag) - @intFromEnum(@as(@TypeOf(tag), .load0));
        self.push(self.locals[slot]);
        continue :dispatch self.read(Instruction);
    },
    .load => {
        const slot = self.read(u8);
        self.push(self.locals[slot]);
        continue :dispatch self.read(Instruction);
    },
    // ...
}
which, in general, I think would be preferable. If the self.push(self.locals[slot]); were nontrivial, you could call a common function between the two prongs.
This avoids generic bloat for the loadX tags. However… given that this is a hot dispatch loop, and the logic is trivial, the inline bloat might actually be helping you out since there will be 5 continue :dispatch sites rather than 2, potentially improving branch prediction. Furthermore, having a separate prong for each tag could help the optimizer with lowering to a jump table.
You’ll have to measure and report back!
this was exactly my thinking! in any case, it’s neat that the tools are there. like in @joed’s linked post - you can imagine using this for operators, especially when typed like .int32_add/.int64_add/.f32_add, etc
In real code, I’d recommend not using my solution, and instead @matklad’s from the original post. Looking at the dates, I was more than likely a bit tipsy during some downtime while prepping New Year’s dinner for my partner and in-laws when I came up with this solution.
For posterity, the post I actually was writing while I got destructed with tagged union matching:
First of all – really excellent article, I enjoyed reading it end to end.
This form used to be rather important, as Zig lacked a counting loop. It has the for (0..10) |i| form now, so I am tempted to call the while-with-increment redundant. Annoyingly,
while (condition) { defer increment; body }
is almost equivalent to
while (condition) : (increment) { body }
But not exactly: if body contains a return, break or try, the defer version would run the increment one extra time, which is useless and might be outright buggy. Oh well.
I’ve run into this before. I was wondering if this could be solved by a continuedefer-like keyword.
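To make the difference concrete, here is a small sketch comparing the two loop forms when the body exits early with break. The function names countWithDefer and countWithContinueExpr are hypothetical and exist only for this illustration.

```zig
const std = @import("std");

/// Increment runs via `defer`, so it also runs while leaving the loop on `break`.
fn countWithDefer(limit: u32, stop: u32) u32 {
    var i: u32 = 0;
    while (i < limit) {
        defer i += 1;
        if (i == stop) break;
    }
    return i;
}

/// Increment is the continue expression, which is skipped on `break`.
fn countWithContinueExpr(limit: u32, stop: u32) u32 {
    var i: u32 = 0;
    while (i < limit) : (i += 1) {
        if (i == stop) break;
    }
    return i;
}

test "defer runs the increment one extra time on break" {
    try std.testing.expectEqual(@as(u32, 4), countWithDefer(10, 3));
    try std.testing.expectEqual(@as(u32, 3), countWithContinueExpr(10, 3));
}
```

When the loop runs to completion without an early exit, both forms leave the counter at the same value; they only diverge on return, break, or a failing try.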
That’s pseudocode! I am talking specifically about the actual coercion operation applied by the compiler (e.g., widening from u8 to u32). In my mental model, @as doesn’t do coercion (the user-space certainly doesn’t), it only sets the result type. Coercion is then inserted by the compiler when it notices that the actual and expected types are different, but compatible.
- How is a reader supposed to know that?
- Using pseudocode (without mentioning it) in a post talking specifically about the syntax of a language seems like a strange choice
Nice article. I agree Zig’s syntax is lovely. (and also agree that one while with incrementer syntax is a bit bleh.)
While pointer type is prefix, pointer dereference is postfix, which is a more natural subject-verb order to read:
I do think there’s a better way to phrase the reason than “natural subject-verb” ordering as a justification for this: as you read from left to right, the operations work on the type from left to right. I believe this ordering is true of all unconditional reductions of a more complicated type to a simpler type in Zig.
E.g., if value is a *const ?[3]u32 and you write value.*.?[0], then .* removes the *const, .? removes the ?, and [0] removes the [3], leaving only u32. The same holds for field access on structs and unions.
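A short sketch that checks this left-to-right peeling with @TypeOf at each step. The variable names are illustrative only.

```zig
const std = @import("std");

test "postfix operators peel the type from left to right" {
    const array: ?[3]u32 = [3]u32{ 10, 20, 30 };
    const value: *const ?[3]u32 = &array;

    // .* removes the *const
    try std.testing.expect(@TypeOf(value.*) == ?[3]u32);
    // .? removes the ?
    try std.testing.expect(@TypeOf(value.*.?) == [3]u32);
    // [0] removes the [3]
    try std.testing.expect(@TypeOf(value.*.?[0]) == u32);

    try std.testing.expectEqual(@as(u32, 10), value.*.?[0]);
}
```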
Indeed, in languages that follow C’s pointer syntax, pointer following is self-inconsistent. In Zig, a single-item pointer, a many-item pointer, and a slice (all forms of pointers) have their accessor on the right.
Where this pattern doesn’t extend, a different pattern shows up: accesses which do invoke control flow tend to wrap the expression: if and while for optionals, for for arrays and slices, and switch for enums and numbers.
orelse, catch, and, and or sit together in a third class as binary operators with control flow, where they go between two expressions.
I like how while syntax lends itself to pointer chasing.
var current: ?*Node = head;
while (current) |c| : (current = c.next) {
    // ...
}
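A self-contained version of the same loop, as a sketch: the Node layout and the sum function are assumptions made up for this example, not taken from the post.

```zig
const std = @import("std");

// Hypothetical singly linked list node.
const Node = struct {
    value: u32,
    next: ?*const Node = null,
};

/// Walks the list; the loop ends when `current` becomes null.
fn sum(head: ?*const Node) u32 {
    var total: u32 = 0;
    var current = head;
    while (current) |c| : (current = c.next) {
        total += c.value;
    }
    return total;
}

test "sum walks the whole list" {
    const c = Node{ .value = 3 };
    const b = Node{ .value = 2, .next = &c };
    const a = Node{ .value = 1, .next = &b };
    try std.testing.expectEqual(@as(u32, 6), sum(&a));
}
```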
As for the value.* syntax, here’s how I think about it (assuming value is a pointer):
Want to set field foo of value? value.foo = ...
Want to set field bar of value? value.bar = ...
Want to set all of what value points to? value.* = .{ ... }
value is a (pointer to a) number? value.* = 42
So to me the .* is like saying “all the fields / content” :^)
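The same mental model as a runnable sketch. The Point struct and the function names are hypothetical, chosen only to mirror the foo/bar lines above.

```zig
const std = @import("std");

// Hypothetical struct standing in for "what value points to".
const Point = struct { x: i32, y: i32 };

/// Sets one field through the pointer.
fn setX(p: *Point, x: i32) void {
    p.x = x;
}

/// Sets everything the pointer points to.
fn reset(p: *Point) void {
    p.* = .{ .x = 0, .y = 0 };
}

/// The pointee is a plain number, so .* is all of its content.
fn setAnswer(n: *i32) void {
    n.* = 42;
}

test "field access versus whole-value access" {
    var p = Point{ .x = 1, .y = 2 };
    setX(&p, 7);
    try std.testing.expectEqual(@as(i32, 7), p.x);
    try std.testing.expectEqual(@as(i32, 2), p.y);
    reset(&p);
    try std.testing.expectEqual(@as(i32, 0), p.x);
    try std.testing.expectEqual(@as(i32, 0), p.y);
}
```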
Nice post!
if body contains a return, break or try, the defer version would run the increment one extra time, which is useless and might be outright buggy.
fwiw I find it useful for custom iterators:
pub const FilteredIterator = struct {
    items: []Actor,
    // iterator state
    index: usize = 0,
    // filters
    flags: Actor.FlagSet = .initEmpty(),

    pub fn next(it: *FilteredIterator) ?Actor.Handle {
        while (it.index < it.items.len) {
            defer it.index += 1;
            if (it.filterAccept()) |handle| return handle;
        }
        return null;
    }

    /// return current handle iff it passes the filters
    fn filterAccept(it: *FilteredIterator) ?Actor.Handle { ... }
};
Before I used defer I kept forgetting this crucial line, causing an infinite loop because the iterator would return the same actor over and over again:
while (it.index < it.items.len) : (it.index += 1) {
    if (it.filterAccept()) |handle| {
        it.index += 1; // don't forget this line
        return handle;
    }
}
Thanks, that’s another great reason for why this syntax form is bad =]
Related reading: Tagged Union Subsets with Comptime in Zig – Mitchell Hashimoto