I’m writing some code for the Game Boy Advance, and I had a bit of unnecessary trouble debugging this snippet of code:
// in `main`
var key_state: input.State = .{};
var frame: usize = 0;
var tile_id: usize = 0;
var x: i32 = 0;
var y: i32 = 0;
while (true) {
    video.vsync();
    key_state.poll();
    x += 2 + @as(i32, key_state.triHorizontal().int());
    reg.bgX_offset[0].x = @intCast(gba.clamp(x, 0, 511));
    y += key_state.triVertical().int();
    // ... some more logic
    // frame finish
    video.vsync();
    ObjAttributes.copy(gba.obj.obj_attrs, &oam_mem);
    frame += 1;
    if (frame == 60) {
        frame = 0;
    }
}
(The types and implementations of most of the variables and functions aren’t relevant to the issue.)
The code is built in ReleaseSmall mode, which is very relevant.
I noticed while running this in mGBA that it was calling video.vsync(), and then falling out of the function into some data, without any kind of loop. This is the generated LLVM IR for this loop:
call fastcc void @video.vsync()
unreachable
It’s clear now that the loop is being optimized out entirely, save for the first call to video.vsync(). After narrowing down the cause, I found that the key_state.triHorizontal() and key_state.triVertical() calls were the issue. Further inspection reveals that the key_state.triHorizontal() call fully expands out to:
@intFromEnum(@as(gba.utils.TriBool, @enumFromInt(
@as(i32, @intCast((@as(u32, key_state.cur) >> @intCast(reg.structs.KeyInput.bit("left"))) & 1)) -
@as(i32, @intCast((@as(u32, 0) >> @intCast(reg.structs.KeyInput.bit("right"))) & 1)),
)));
The actual issue here is the implementation of KeyInput.bit:
pub fn bit(comptime field_name: []const u8) u16 {
    return 1 << @bitOffsetOf(@This(), field_name);
}
We can see that the returned value is actually a bit mask for the specified field in KeyInput, but the triHorizontal() code uses it as a bit offset. Adding comptime to both calls to KeyInput.bit in triHorizontal() and triVertical() forces the compiler to evaluate them at comptime, where it sees that the values are not valid shift amounts and complains that they don’t fit in a u5, making the underlying problem obvious.
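To make the mask/offset distinction concrete, here is a minimal sketch of one way to separate the two. The `KeyInput` layout below is a stand-in I am assuming for illustration (the real struct lives in the GBA library and may differ), and `bitOffset`, `bitMask`, and `horizontal` are hypothetical names, not the library's API.

```zig
const std = @import("std");

// Assumed stand-in for the real KeyInput register layout (hypothetical).
const KeyInput = packed struct(u16) {
    a: bool = false,
    b: bool = false,
    select: bool = false,
    start: bool = false,
    right: bool = false,
    left: bool = false,
    up: bool = false,
    down: bool = false,
    r: bool = false,
    l: bool = false,
    _unused: u6 = 0,

    /// Bit position of a field. Returning u4 (the shift-amount type for u16)
    /// means the result can be used in a shift without any @intCast.
    pub fn bitOffset(comptime field_name: []const u8) u4 {
        return @bitOffsetOf(KeyInput, field_name);
    }

    /// Single-bit mask for a field, built from its offset.
    pub fn bitMask(comptime field_name: []const u8) u16 {
        return @as(u16, 1) << bitOffset(field_name);
    }
};

/// Hypothetical equivalent of the horizontal calculation: the "left" bit
/// minus the "right" bit, each extracted by shifting with the bit offset.
fn horizontal(cur: u16) i32 {
    const left: i32 = (cur >> KeyInput.bitOffset("left")) & 1;
    const right: i32 = (cur >> KeyInput.bitOffset("right")) & 1;
    return left - right;
}
```

With the offset typed as `u4`, there is no `@intCast` at the call site to hide a mix-up: passing the `u16` mask where a shift amount is expected should be rejected as a type mismatch rather than compiled into an out-of-range cast.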
Had I been compiling similar code for a hosted (non-freestanding) target with runtime safety enabled, this would have triggered a runtime panic. In ReleaseSmall, however, no safety check is inserted (as expected), and the optimizer assumes the invalid cast can never happen, so it treats the rest of the loop as unreachable and doesn’t bother generating it. At run time the cast executes unchecked anyway, and execution falls out of main because the code that follows wasn’t actually unreachable.
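The out-of-range cast itself can be shown in isolation. The sketch below uses `std.math.cast`, the checked counterpart of `@intCast`, which returns `null` instead of invoking illegal behavior. The value 32 stands in for the mask `1 << 5`, assuming the `left` field sits at bit 5; `shiftAmount` is a hypothetical helper, not part of the original code.

```zig
const std = @import("std");

/// Checked conversion of a would-be shift amount for a u32 operand.
/// Returns null when the value does not fit in u5, i.e. when it is >= 32.
fn shiftAmount(value: u16) ?u5 {
    return std.math.cast(u5, value);
}

pub fn main() void {
    // A real bit offset fits in u5 and yields a usable shift amount.
    std.debug.print("offset 5 -> {?}\n", .{shiftAmount(5)});
    // The mask 1 << 5 == 32 does not fit in u5; @intCast on this value is
    // what a safe build would panic on and ReleaseSmall leaves unchecked.
    std.debug.print("mask 32 -> {?}\n", .{shiftAmount(32)});
}
```

This only demonstrates why the cast is invalid; it is not a claim about how the compiler arrives at the `unreachable` in the IR.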
I’m wondering if there’s some way to get the compiler to report this kind of opaque programming error, given that it apparently knows at compile time that the logic is invalid (hence the unreachable in the generated LLVM IR).
(It should also be stated that I’m using Zig version 0.17.0-dev.387+31f157d80, since that’s the latest version for which zls provides build system integration)
Thanks for any help.