Why doesn't Zig optimize this, whereas the C compiler does?

In C89 the rule was completely clear:

if a member of a union object is accessed after a value has been stored in a different member of the object, the behavior is implementation-defined

As often happens, the specific standard-ese on this topic has become muddier over time. As a matter of practice, no compiler would dare break type punning through unions, because too many programs depend on it working.

The footnote you’re referring to, which does explicitly mention type punning, points to this section:

When a value is stored in a member of an object of union type, the bytes of the object representation
that do not correspond to that member but do correspond to other members take unspecified values.

This is one of those aggravating negative-space statements, and it is simply less clear than C89 was. But we can connect the dots: two members of a union both have well-defined memory layouts (essential in C), so the non-padding bytes common to both members have a specified value, even when accessed through the inactive member.
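To make the "common bytes" point concrete, here is a minimal sketch in C. The union `Overlap` and the function `shared_bytes_match` are names made up for illustration; the sketch assumes the usual layout where both members start at offset 0 of the union.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical union: half overlaps the first two bytes of word. */
union Overlap {
    uint16_t half;
    uint32_t word;
};

/* Returns 1 if the bytes shared by both members still hold the value
   that was stored through half. */
int shared_bytes_match(uint16_t v) {
    union Overlap u;
    u.word = 0xAABBCCDDu;
    u.half = v;           /* the two bytes not covered by half are now unspecified */
    uint32_t w = u.word;  /* read through the inactive member */
    /* Compare only the bytes that correspond to half. */
    return memcmp(&w, &v, sizeof v) == 0;
}
```

Calling `shared_bytes_match(0x1234)` should return 1 on any platform. The remaining two bytes of `w` are the ones the quoted section leaves unspecified, so the sketch deliberately does not inspect them.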

So it's not quite UB, although it’s possible to reach UB, in particular by reading a padding byte of one member through a field of another member. But if all members of the union are dense, with no padding, then the behavior when accessing any one through any other is implementation-defined, which in practice means it’s predictable given the known endianness of the platform.
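A small sketch of the dense case, in C. The union and function names here are hypothetical, and the float example assumes a 32-bit IEEE 754 `float`, which the standard does not guarantee but which holds on essentially every mainstream platform.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: float is 32 bits wide, so both members are equally dense. */
static_assert(sizeof(float) == sizeof(uint32_t), "sketch assumes a 32-bit float");

union FloatBits { float f; uint32_t bits; };
union WordBytes { uint32_t word; uint8_t bytes[4]; };

/* Reinterpret the bytes of a float as an unsigned 32-bit integer. */
uint32_t float_bits(float f) {
    union FloatBits u;
    u.f = f;
    return u.bits;
}

/* Return the byte stored at the lowest address of a 32-bit word. */
uint8_t first_byte(uint32_t w) {
    union WordBytes u;
    u.word = w;
    return u.bytes[0];
}
```

With IEEE 754 floats, `float_bits(1.0f)` yields `0x3F800000`. `first_byte(0x11223344)` yields `0x44` on a little-endian target and `0x11` on a big-endian one, which is the "predictable given the known endianness" part.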
