Ok so I’ve been cooking. I was able to figure most of this out using tagged unions and inline switch prongs (thanks @andrewrk). It definitely wasn’t immediately intuitive from the docs but eventually I pieced it together… mostly.
I’m gonna talk about what I did, the issues I still have, and my journey going through current documentation.
tl;dr: I didn’t know about the existence of tagged unions and how they can resolve runtime enums into comptime types with the help of inline switch prongs. I also still don’t know how to increment the value inside of a tagged union.
My discovered solution: Tagged Unions + inline switch prongs
I decided to create the following tagged union:
// Following Official docs here
// https://ziglang.org/documentation/0.12.0/#Tagged-union
// (test_tagged_union.zig)
const CellSize = enum { c1, c2, c4, c8 };
const Cell = union(CellSize) {
    c1: u8,
    c2: u16,
    c4: u32,
    c8: u64,

    // Custom helper functions
    fn get_type(comptime tag: CellSize) type {
        // Based on looking through LSP options (zls 0.12.0)
        return std.meta.TagPayload(Cell, tag);
    }

    // Based on official docs for Inline Switch Prongs
    // https://ziglang.org/documentation/0.12.0/#Inline-Switch-Prongs
    // (test_inline_else.zig)
    fn get_size(tag: CellSize) usize {
        return switch (tag) {
            inline else => |size| @sizeOf(std.meta.TagPayload(Cell, size)),
        };
    }
};
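To show how the two helpers differ, here is a minimal self-contained sketch. It repeats the union so it can be run on its own with `zig test`; the test itself is illustrative only, and the `_ = &tag;` line is just there to keep the tag runtime-known without the compiler complaining about an unmutated `var`.

```zig
const std = @import("std");

const CellSize = enum { c1, c2, c4, c8 };
const Cell = union(CellSize) {
    c1: u8,
    c2: u16,
    c4: u32,
    c8: u64,

    fn get_type(comptime tag: CellSize) type {
        return std.meta.TagPayload(Cell, tag);
    }

    fn get_size(tag: CellSize) usize {
        return switch (tag) {
            inline else => |size| @sizeOf(std.meta.TagPayload(Cell, size)),
        };
    }
};

test "comptime tag gives a type, runtime tag gives a size" {
    // get_type returns a `type`, so its tag has to be comptime-known.
    try std.testing.expect(Cell.get_type(.c2) == u16);

    // get_size accepts a runtime-known tag; the inline prong does the mapping.
    var tag = CellSize.c4;
    _ = &tag;
    try std.testing.expectEqual(@as(usize, 4), Cell.get_size(tag));
}
```

The asymmetry is the point: `get_type` can only be called where the tag is already comptime-known (for example inside an inline prong), while `get_size` can be called from anywhere.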
This allowed me to refactor the load_int function to the following without compile-time errors:
fn load_int(self: *Self, cell_size: CellSize) Cell {
    const size = Cell.get_size(cell_size);
    var bytes: [size]u8 = undefined;
    for (0..size) |idx| {
        var offset = self.ptr + idx;
        if (offset >= self.memory.len) {
            offset -= self.memory.len;
        }
        bytes[idx] = self.memory[offset];
    }
    switch (cell_size) {
        inline else => |cell| return std.mem.readInt(Cell.get_type(cell), @constCast(bytes), .little),
    }
}
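A variant worth sketching, in case the compiler objects to the runtime-sized array once the function is actually analyzed: array lengths have to be comptime-known, so the buffer can be declared inside the inline prong, where the tag is comptime. This is a sketch under assumptions, not the project's real code. `Tape` is a hypothetical stand-in for the interpreter struct, the modulo wrap-around is a simplification of the subtraction above, and `@unionInit` is used to wrap the integer back into a `Cell`.

```zig
const std = @import("std");

const CellSize = enum { c1, c2, c4, c8 };
const Cell = union(CellSize) {
    c1: u8,
    c2: u16,
    c4: u32,
    c8: u64,
};

// Hypothetical stand-in for the interpreter state; only the fields used here.
const Tape = struct {
    memory: []const u8,
    ptr: usize,

    fn load_int(self: *const Tape, cell_size: CellSize) Cell {
        switch (cell_size) {
            inline else => |tag| {
                // `tag` is comptime-known here, so T and the array length are too.
                const T = std.meta.TagPayload(Cell, tag);
                var bytes: [@sizeOf(T)]u8 = undefined;
                for (&bytes, 0..) |*byte, idx| {
                    // Wrap around the end of memory (assumes memory is non-empty).
                    byte.* = self.memory[(self.ptr + idx) % self.memory.len];
                }
                const value = std.mem.readInt(T, &bytes, .little);
                return @unionInit(Cell, @tagName(tag), value);
            },
        }
    }
};

test "load_int wraps around and picks the payload type from the tag" {
    const memory = [_]u8{ 0x01, 0x02, 0x03, 0x04 };
    const tape = Tape{ .memory = &memory, .ptr = 3 };
    const cell = tape.load_int(.c2);
    try std.testing.expectEqual(@as(u16, 0x0104), cell.c2);
}
```

Doing the whole body inside one inline prong means each cell size gets its own copy of the loop, which costs some code size but keeps every type and length known to the compiler.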
I also refactored the interpret function to the following:
pub fn interpret(self: *Self, code: []const u8) void {
    var cell_size = CellSize.c1;
    var code_ptr: usize = 0;
    var mod_on = false;
    // ...
    while (code_ptr < code.len) : (code_ptr += 1) {
        // "Command" is an enum
        const cur_command = Command.from_char(code[code_ptr]);
        //...
        switch (cur_command) {
            //...
            .Mod2 => cell_size = CellSize.c2, // Handle modifiers
            .Mod4 => cell_size = CellSize.c4,
            .Mod8 => cell_size = CellSize.c8,
            .Increment => self.increment(cell_size),
            //...
        }
        if (cell_size != CellSize.c2) {
            mod_on = !mod_on;
            if (!mod_on) {
                cell_size = CellSize.c2;
            }
        }
    }
}
My remaining skill issue: Incrementing the value in a tagged union
I am struggling a bit with a casting issue in the increment function:
fn increment(self: *Self, cell_size: CellSize) void {
    var value: Cell = self.load_int(cell_size);
    switch (value) {
        // acc to docs, this doesn't need to be inline... right?
        // https://ziglang.org/documentation/0.12.0/#Tagged-union
        // (test_switch_modify_tagged_union.zig)
        // NOTE: Compile error: incompatible types *root.Cell and comptime_int
        else => |*val| val.* = @addWithOverflow(val, 1)[0],
    }
    self.store_int(value);
}
This function causes the following compile error:
src/root.zig:140:36: error: incompatible types: '*root.Cell' and 'comptime_int'
            else => |*val| val.* = @addWithOverflow(val, 1)[0],
                                   ^~~~~~~~~~~~~~~~~~~~~~~~
src/root.zig:140:53: note: type '*root.Cell' here
            else => |*val| val.* = @addWithOverflow(val, 1)[0],
                                                    ^~~
src/root.zig:140:58: note: type 'comptime_int' here
            else => |*val| val.* = @addWithOverflow(val, 1)[0],
                                                         ^
I’ve tried a few different ways to rectify this type error, such as @addWithOverflow(val, @as(@TypeOf(val)), @intCast(1)), to no avail. I just don’t know the proper way to do this when working with a tagged union. Probably something simple.
My journey through the docs
Since you’re interested in creating more documentation, I figured I would share my process of coming upon this solution. I’ll go in chronological order starting after I tried changing every comptime_int to a type.
I still have all my tabs open so the research trail will be fairly accurate. I’ll also try to recall all the refactoring I did. However, I can’t guarantee perfect accuracy there because I haven’t been committing each change.
1. comptime var int_type: type = u8;
Off of your suggestion, I decided to try and figure out if I could simply drop in some comptime’s somewhere and make it work. I thought that maybe the compiler could resolve all the possible values through what is assigned to int_type if I simply marked it and each switch prong that modified the value as comptime. However, this resulted in an error stating something like int_type depends on runtime control flow.
2. const int_types = .{u8, u16, u32, u64};
After this, I had an idea. What if I defined a constant slice containing all of the valid types and simply index into that array at runtime?
I defined values similar to the following:
const int_types = .{u8, u16, u32, u64};
var cur_int_type: usize = 0;
Then, I passed them to my unchanged functions using the following syntax:
switch (cur_command) {
    .Mod2 => cur_int_type = 1,
    .Mod4 => cur_int_type = 2,
    .Mod8 => cur_int_type = 3,
    .Increment => self.increment(int_types[cur_int_type]),
    //...
}
However, this did not work. Here, I got an error stating that cur_int_type was not resolvable at compile time. I tried many different permutations of this solution, including making cur_int_type comptime and throwing in comptime every time it was modified, getting various different compile errors.
Eventually, I realized that I was actually defining a tuple rather than a slice of types. I attempted to add []type to the definition but the compiler did not like that either.
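For comparison, here is a sketch of one way a list of types can be consulted with a runtime index. It is a different technique from what the attempts above did: the list is an array (`[_]type{...}`) and the lookup goes through an `inline for`, which unrolls the loop so that the type is comptime-known in each copy. `sizeOfIndex` is a hypothetical helper name, and this only works for operations that return an ordinary runtime value (a size here), not for returning the type itself.

```zig
const std = @import("std");

// An array of types; `type` is comptime-only, so this must be a constant.
const int_types = [_]type{ u8, u16, u32, u64 };

// Hypothetical helper: maps a runtime index to the size of the matching type.
fn sizeOfIndex(index: usize) usize {
    // `inline for` unrolls the loop; `T` and `i` are comptime-known in each copy,
    // while the comparison against `index` happens at runtime.
    inline for (int_types, 0..) |T, i| {
        if (i == index) return @sizeOf(T);
    }
    unreachable; // caller passed an index outside the array
}

test "runtime index resolved through an unrolled loop" {
    var index: usize = 2; // stands in for a value decided at runtime
    _ = &index;
    try std.testing.expectEqual(@as(usize, 4), sizeOfIndex(index));
}
```

This is essentially a hand-rolled version of what an inline switch prong generates: one branch per possible comptime value, selected by a runtime comparison.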
3. inline
It was around this time that I noticed a new reply from @andrewrk to this post. I had never seen this keyword before so I decided to research it.
I first went to the official docs for inline switch prongs. I remember the code being a bit confusing and thinking it was not quite applicable. I was thinking “How do I do that for a set of types?”. I also remember being quite confused about where I would put the inline prongs. Every idea became a catch-22 situation coming back to the fact that I would have to edit int_type at runtime, an impossibility because this would cause int_type to rely on runtime control flow.
This led me to googling “how to use inline zig”, which led me to Zig Comptime - WTF is Comptime (and Inline) - Zig NEWS. This article really helped me grasp the fundamentals of comptime, but left me hanging a bit on inline. Here were my main takeaways:
- A function taking in any argument as comptime becomes a comptime function.
- My functions like increment and load_int take in comptime type’s, making them comptime functions.
- My friend [InKryption] put it succinctly in a quote: “comptime exists in a sort of pure realm where I/O doesn’t exist.”
- Since the types passed to those functions were dependent on runtime input, I would need to refactor them somehow.
- You can use the inline keyword on a function to tell the compiler to copy the function contents wherever the function is called instead of calling the function, similar to macro functions in C.
While I went in wanting to understand inline, I came out realizing that I was going at this fundamentally wrong. It was less that I didn’t understand what comptime was, it was more that I was unaware how it propagated.
I now know that changing the type passed using var int_type in any shape or form made it unresolvable when passed to functions like increment(self: *Self, int_type: type). This is because increment’s true signature should be increment(self: *Self, comptime int_type: type).
As a retrospective aside, I actually think that the compiler errors and/or the language design failed me here. The compiler only errored on this function where it was called in interpret. I think it should’ve errored on the function for not explicitly marking the int_type argument as comptime. Maybe this is because type is always comptime, and therefore does not require the keyword. I can understand wanting to type less, but this seems against the ethos of the language which favors readability over writability. Either that, or I’m remembering the compiler errors wrong.
4. Wtf do be that union(enum) tho frfr???
After that, I returned to the official docs on inline switch prongs. This is where I looked back at test_inline_else.zig and noticed something I had never seen before:
const AnySlice = union(enum) {
    a: SliceTypeA, //  ^--- what does this mean??
    b: SliceTypeB,
    c: []const u8,
    d: []AnySlice,
};
I already had a grasp on what a union was from C. I also know that optionals and errors in Zig are implemented as unions. However, I had never run into a union(enum). I had to Ctrl-F for this pattern through the docs to find out what this was.
Eventually, I stumbled upon the tagged union. This is where everything started to really click. I realized that the tags could be runtime-resolvable enum’s that specify what union field to use.
What was a bit disappointing was that none of the code snippets under tagged union really demonstrated how hand-in-hand this plays with inline switch prongs. For that, I had to return to inline switch prongs. Specifically, it didn’t all click until I looked again at test_inline_else.zig with the knowledge that AnySlice was a tagged union.
This was where I truly realized that inline also implies comptime in a way, meaning that the tag, and with it the type of the captured value (in this case, slice), is available at comptime inside each prong, even though the payload itself is still a runtime value. This is because a prong is generated at comptime for every possible tag of any:
fn withSwitch(any: AnySlice) usize {
    return switch (any) {
        // With `inline else` the function is explicitly generated
        // as the desired switch and the compiler can check that
        // every possible case is handled.
        inline else => |slice| slice.len,
    };
}
Now, based on the comment from the docs, I could’ve potentially realized this sooner. This is especially true when the following is the first line under the section title:
Switch prongs can be marked as inline to generate the prong’s body for each possible value it could have, making the captured value comptime.
However, I was mostly looking at the code. The comment featured in the code snippet above only implies that something about slice is comptime, because the compiler is checking for every possible case. I think if this comment explicitly mentioned that the captured slice value now had a comptime-known type in each prong, everything would’ve clicked a bit sooner.
Conclusion
I think tagged unions should be discussed more in documentation as a way to select a type to use for certain operations during runtime. In addition, there should be more emphasis placed on how inline can be used as a way to resolve runtime values to comptime values.
Sorry for the lack of convenience links when I reference the docs. As a new user I only get two links per post. Also, sorry the post got so long.
Thanks for your help!