Help with bit manipulations in ziglings

I’m playing with ziglings, exercise 098_bit_manipulation2.zig. The correct code should look like this:

const std = @import("std");
const ascii = std.ascii;
const print = std.debug.print;

pub fn main() !void {
    print("Is this a pangram? {?}!\n", .{isPangram("The quick brown fox jumps over the lazy dog.")});
}

fn isPangram(str: []const u8) bool {
    if (str.len < 26) return false;
    var bits: u32 = 0;
    for (str) |c| {
        if (ascii.isASCII(c) and ascii.isAlphabetic(c)) {
            bits |= @as(u32, 1) << @truncate(ascii.toLower(c) - 'a');
        }
    }

    return bits == @as(u32, (1 << 26) - 1);
}

But when I removed the builtin @truncate, the compiler gave me an error:

exercises/098_bit_manipulation2.zig:56:54: error: expected type 'u5', found 'u8'
            bits |= @as(u32, 1) << (ascii.toLower(c) - 'a');
                                    ~~~~~~~~~~~~~~~~~^~~~~
exercises/098_bit_manipulation2.zig:56:54: note: unsigned 5-bit int cannot represent all possible unsigned 8-bit values

I don’t understand why this is. Why does the compiler care whether the type of the operand is u5 or not?
And since the return type of ascii.toLower is u8, how does it know a u5 is needed here?

From the language reference for Bit Shift Left:

b must be comptime-known or have a type with log2 number of bits as a.

My understanding is that this is to help you avoid trying to shift more than the type allows.

For example, if you tried to do either of the following, you get a compile error:

// error: type 'u5' cannot represent integer value '32'
const foo = @as(u32, 1) << 32;

// error: shift amount '33' is too large for operand type 'u33'
const bar = @as(u33, 1) << 33;

These compile errors make shifting too much impossible for:

  • comptime-known shift amounts, or
  • types where log2(bit_count) has an exact answer (u8, u16, u32, u64, etc)
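To make the "exact answer" case concrete, here is a small sketch using std.math.Log2Int, which, as far as I know, is the standard library helper that gives the shift-amount type for an integer type. It assumes a reasonably recent Zig; the empty main is only there so the file builds on its own.

```zig
const std = @import("std");
const Log2Int = std.math.Log2Int;

comptime {
    // For power-of-two bit widths, the shift type has exactly log2(bit_count) bits.
    std.debug.assert(Log2Int(u8) == u3);
    std.debug.assert(Log2Int(u16) == u4);
    std.debug.assert(Log2Int(u32) == u5);
    std.debug.assert(Log2Int(u64) == u6);
    // The largest u5 value is 31, which is also the largest valid shift for a u32,
    // so no u5 value can ever be "too much".
    std.debug.assert(std.math.maxInt(Log2Int(u32)) == @bitSizeOf(u32) - 1);
}

pub fn main() void {}
```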

However, since e.g. log2_int_ceil(33) is 6 (meaning u33 needs a u6 for its shift amount), it is possible to shift more than the type allows when using runtime shift values and types with an inexact log2(bit_count) result:

var a: u6 = 34;
_ = &a; // this is just a way to force `a` to be a runtime value
const b = @as(u33, 1) << a;

Running this gives a runtime panic in safe modes:

thread 21600 panic: shift amount is greater than the type size

So, requiring the shift amount to have log2(bit_count) bits is not totally foolproof, but it can turn some potential runtime panics into compile errors.
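If the shift amount really can be out of range at runtime, one option (a sketch, not the only way) is std.math.shl, which to my knowledge returns 0 for an oversized shift amount instead of panicking. The helper name shiftOne below is made up for illustration.

```zig
const std = @import("std");

// Hypothetical helper: shift 1 left by `amount` within a u33, without panicking.
fn shiftOne(amount: u6) u33 {
    // std.math.shl truncates overflowed bits; an amount >= 33 yields 0.
    return std.math.shl(u33, 1, amount);
}

pub fn main() void {
    // u33 needs 6 bits for its shift amount, and a u6 can hold values up to 63.
    comptime std.debug.assert(std.math.Log2Int(u33) == u6);

    var a: u6 = 34;
    _ = &a; // force `a` to be a runtime value

    // With the plain `<<` operator this would panic in safe modes.
    std.debug.print("{d}\n", .{shiftOne(a)});
}
```

Whether a silent 0 is what you want depends on the situation; checking the amount yourself before shifting is just as valid.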

I see, thank you for your detailed explanation!

The idea behind such bit manipulation is that you know the result. And the result is all 26 bits set. You can write this as “1111…” or more briefly as a hex number “0x…”. That’s all.
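As a small illustration of "same value, different spellings", here is a sketch using only 4 bits so the exercise's 26-bit answer stays unspoiled. The helper lowBits is hypothetical, not something from ziglings.

```zig
const std = @import("std");

// Hypothetical helper: a mask with the lowest `n` bits set (n is at most 31).
fn lowBits(n: u5) u32 {
    return (@as(u32, 1) << n) - 1;
}

pub fn main() void {
    // For 4 bits, the computed mask equals the binary and hex literals.
    std.debug.assert(lowBits(4) == 0b1111);
    std.debug.assert(lowBits(4) == 0xf);
    std.debug.print("{b} {x}\n", .{ lowBits(4), lowBits(4) });
}
```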

Also, it is totally foolproof when the shifted value has an unsigned integer type whose bit width is a power of two, such as u16, u32, and usize on all currently supported platforms. That’s the most common case for bit shifting, so the restriction works better in practice than in theory.
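Tying this back to the exercise: u32 is one of those power-of-two types, so its shift amount has to be a u5, and the u8 from ascii.toLower needs an explicit narrowing. A minimal sketch, assuming Zig 0.11 or later (where @intCast infers its result type) and a made-up helper name letterBit:

```zig
const std = @import("std");

// Hypothetical helper: the bit for a lowercase ASCII letter.
// Assumes `c` is in 'a'..'z'; anything below 'a' would overflow the subtraction.
fn letterBit(c: u8) u32 {
    // For u32 the shift amount must be a u5 (log2(32) = 5 bits).
    const Shift = std.math.Log2Int(u32);
    // @intCast asserts in safe modes that the value fits in a u5,
    // whereas @truncate would silently keep only the low 5 bits.
    const amount: Shift = @intCast(c - 'a');
    return @as(u32, 1) << amount;
}

pub fn main() void {
    // 'q' - 'a' is 16, so this prints the value with only bit 16 set.
    std.debug.print("{d}\n", .{letterBit('q')});
}
```

Both @truncate and @intCast satisfy the compiler here; the difference only shows up if the value would not fit in 5 bits.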

Right, I may have unintentionally made that part unclear. Edited my comment to hopefully make it a bit clearer.
