The packed union versus enum

I wrote some other related stuff on the subject, but here we go once more…

In my chessprogram I use packed unions because there are a lot of arrays which I have to easily index by enum.
But I would really like to use a raw enum, because I also coded a custom generic array (which is not the subject here). And because there are some other inconvenient effects regarding readability. I would prefer a simple enum.

The “problem” is that the raw enum version generates an extra and instruction in array_index()

Case 1: The raw enum version:

const Square = enum(u6) {
   a1, b1, c2, // etc. until h8 = 63

    fn array_index(self: Square) usize {
        return @intFromEnum(self); // assembly generate an extra `and 63` instruction.
    }
};

Case 2: My used packed union trick is faster:

const Square = packed union {
    const Enum = enum(u6) {
        a1, b1, c2, // etc. until h8 = 63
    };
    e: Enum,
    u: u6,

    fn array_index(self: Square) usize {
        return self.u.; // just one `movzx` instruction.
    }
}

Now this is not the only entity for which I use this “u-indexing” trick. So we can imagine that there is a slowdown noticable when converting everything to clean enums.

The size u6 is kind of a requirement orelse I have to rewrite a lot of code.

  1. Why is the extra “and” in case 1 and not in case 2?
  2. Is it possible to code it cleanly such that there is no useless “and” instruction?
  3. Any other thoughts welcome.

BTW: I use zig 0.15.2.

3 Likes

for a chess program, u6 seems about as clean to me as you can get, right? i guess maybe there’s some concern about using a u6 as “just a number” and vice versa?

I’ve never used packed unions, and want to learn from this. I see in docs for packed structs (with packed-union fields) that “A packed union field uses exactly the bit width of the union field with the largest bit width,” but the bit of documentation dedicated to packed union indicates that “All fields in a packed union must have the same @bitSizeOf“. I see your e and u are the same bitSizeOf, so this is a little beside the point, but, as I said, I’d like to learn from this.

Anyway, what I don’t see is how you use this. I’m guessing that somewhere some function takes a Square that is actually an e, because that’s easier for the caller to reason about (knight to d4), but, the function that takes this, sets square.u = @intFromEnum(input.e), or something like that? If I’m not in left field, is the situation that you have, with “Case 2”, a front-loaded and 63 instruction in the assembly, with a clean hotpath, but if you change to the “Case 1” scenario, you have a hotpath with tons of @intFromEnum calls (every call to Square.array_index())?

I reproduced this in Compiler Explorer. (Change the value in switch to swap between variants.)

I think this must be a quirk caused by the conversion from a non-power-of-two integer type, and the fact that this doesn’t happen in the packed union in 0.15.2 seems to just be an outlier there. Interestingly, your packed union variant indeed does not emit the extra and eax, 64 with Zig 0.15.2, but the instruction does pop up when compiled with zig trunk (Compiler Explorer doesn’t have 0.16.0 yet).

The reliable way to mitigate the extra instruction in both 0.15.2 and trunk seems to be to change the type to enum(u8), but of course if you’re using it in packed structs, you’re going to lose the memory saving properties of the u6. On the other hand, if you’re bitpacking, there are inevitably going to be plenty of ands and bit shifts emitted anyway.

In any case, bit operations are basically the cheapest thing a CPU can do, so the question is: does this even matter in the grand scheme of things? My advice would be to try and measure whether this even affects the speed of your program meaningfully.

10 Likes

Good advice. Very true.

I noticed during experiments, when trying to optimize A some other optimization B can fall down in the grand scheme.
I will try and experiment with “unpacking” some structs.

1 Like

Yeah, can confirm. I tried every kind of type punning known to man, and I could not get the and instruction to go away. I think the unused bits of an enum are undefined, while the unused bits in a packed union are defined as 0, so the compiler introduces a cleanup step of these bits.

7 Likes

BTW: I was fiddling around with Zig 0.16.0 where packed unions are directly comparable and ‘switchable’.
But they are not accepted by AutoHashMap. Is that an omission or on purpose because of other reasons?

wdym “not accepted”?

IIRC this is just an oversight in AutoArrayHashMap and there’s an issue for it.

1 Like

the packed union is not hashable as key

But it’s backing int would be, if implemented.