Weird bit-twiddling coercion behavior

I’m slowly going insane trying to figure out whether this is a legitimate bug or whether I’m just holding things wrong.

Consider the following C code:

#include <stdio.h>
#include <stdint.h>

int main(int argc, char **argv) {
    uint8_t a = 0x30;
    uint16_t b = 0xCAFE;
    a = (a & 0xF0) | (b & 0x0F);
    printf("%02X\n", a);
    return 0;
}

This correctly prints out 3E, whereas the equivalent Zig version:

const std = @import("std");

pub fn main() void {
    var a: u8 = 0x30;
    const b: u16 = 0xCAFE;
    a = (a & 0xF0) | (b & 0x0F);
    std.debug.print("{X}\n", .{a});
}

…makes the compiler choke and die because b is a u16. Which kinda makes sense if you consider bit twiddling to be like any other integer arithmetic. But here I mask off the least significant nibble of the least significant byte with b & 0x0F, so the result could (should?) naturally coerce to a u8, given the other operand and the assignment.

I’m getting ahead of myself. For the sake of argument, let’s try coercing b to a u8:

    a = (a & 0xF0) | (@as(u8, b) & 0x0F);

The compiler dies with error: type 'u8' cannot represent integer value '51966', which, yeah, makes perfect sense.

How about coercing the operation instead? Maybe that will give the compiler a hint to do the right thing?

    a = (a & 0xF0) | @as(u8, b & 0x0F);

It works! Huh. I’m feeling adventurous now, so I decide to refactor the code into its own function where b is an argument:

const std = @import("std");

fn argh(b: u16) void {
    var a: u8 = 0x30;
    a = (a & 0xF0) | @as(u8, b & 0x0F);
    std.debug.print("{X}\n", .{a});
}

pub fn main() void {
    argh(0xCAFE);
}

Functionally equivalent, but the compiler dies with error: expected type 'u8', found 'u16'. Argh indeed. To make things downright bizarre, inlining the function or declaring b comptime magically makes the compiler happy again.

What’s going on here? Surely I can’t be the first one to stumble upon this?

@as(u8, b & 0x0F) only works in the first example because the compiler does the calculation b & 0x0F at comptime and then determines that the resulting integer can indeed fit into a u8.
This is possible because b is a const that is initialized by a comptime-known value (0xCAFE).
This breaks when you move the code into a function, because Zig assumes function parameters to be runtime-known (otherwise you could easily explode compile times, since every call with a different comptime-known value would force the function to be analyzed again).

For runtime-known values the compiler is not smart enough to figure out that b & 0x0F always fits into a u8. Instead you have to help it out with an @intCast, which asserts that the value fits into a u8: @as(u8, @intCast(b & 0x0F))
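
Put back into the example function, a minimal sketch of that fix might look like this (assuming a recent Zig where @intCast takes its result type from the surrounding @as):

const std = @import("std");

fn argh(b: u16) void {
    var a: u8 = 0x30;
    // @intCast asserts (with a safety check) that b & 0x0F fits into a u8.
    a = (a & 0xF0) | @as(u8, @intCast(b & 0x0F));
    std.debug.print("{X}\n", .{a});
}

pub fn main() void {
    argh(0xCAFE);
}

Since b & 0x0F can never exceed 0x0F, the assertion always holds and this prints 3E.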

What you’re doing, semantically, is truncating the high bits. So use @truncate:

fn argh(b: u16) void {
    var a: u8 = 0x30;
    a = (a & 0xF0) | @truncate(b & 0x0F);
    std.debug.print("{X}\n", .{a});
}

I think @truncate has the right result location here; you might have to force the issue with an @as cast around the @truncate, but I don’t think so.
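
If it does turn out that the result type can’t be inferred, the explicit variant would be something like this (a sketch, assuming @truncate picks up its result type from the @as):

const std = @import("std");

fn argh(b: u16) void {
    var a: u8 = 0x30;
    // @as supplies the u8 result type for @truncate explicitly.
    a = (a & 0xF0) | @as(u8, @truncate(b & 0x0F));
    std.debug.print("{X}\n", .{a});
}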

Oooh! That’s it! I figured it would be able to infer from the call that b is essentially a compile-time constant. Many thanks!

There are two ways to get this, if it’s what you want:

inline fn argh(b: u16) void {
    var a: u8 = 0x30;
    a = (a & 0xF0) | @as(u8, b & 0x0F);
    std.debug.print("{X}\n", .{a});
}

“Semantic inlining” will let the comptime-known-ness follow “b”. The other option:

fn argh(comptime b: u16) void {
    var a: u8 = 0x30;
    a = (a & 0xF0) | @as(u8, b & 0x0F);
    std.debug.print("{X}\n", .{a});
}

This enforces that b must be comptime-known.
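
At the call site the difference would look roughly like this (a sketch; the runtime-value lines are hypothetical and shown commented out, and the exact error wording may differ):

pub fn main() void {
    argh(0xCAFE); // fine: an integer literal is comptime-known

    // Passing a runtime-known value would be rejected, roughly:
    // var b: u16 = some_runtime_value; // hypothetical runtime source
    // argh(b); // compile error: the argument must be comptime-known
}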

It’s unfair that I can’t select multiple solutions, since this works like a charm too. (Though it feels a bit iffy to manually truncate the bits, even though that’s the intention. :grinning_face_with_smiling_eyes:)

Thanks! I figured it was related to values being known at compile time. It was just the runtime behavior that confused me.
