Zig Compiler's Behavior in Regards to "Strict Aliasing"

haydenridd · February 19, 2025, 9:37pm

One of the most confusing and widely broken rules in C and C++ is the “Strict Aliasing” rule. The following Gist does a really good job of explaining what it is, and why operations that seem like they “should” work (and often do!) are actually invoking undefined behavior.

The given example of casting an int * to a float * is a great one, as naively one might assume that as long as the “underlying bytes” of the int are also a valid float value, all should be well.

What is the Zig compiler’s approach to “strict aliasing” as described?

The docs for @bitCast seem pretty clear that this is a valid way to type pun, similar to how memcpy() is one of the very few “correct” ways to type pun in C/C++. The docs for @ptrCast are a little less specific about what you are and aren’t allowed to do with regard to types.

For instance, to use the example from the linked article on strict aliasing, is this snippet well defined?

const std = @import("std");

fn foo(f: *f32, i: *i32) i32 {
    i.* = 1;
    f.* = 0.0;
    return i.*;
}

pub fn main() !void {
    var x: i32 = 0;
    std.debug.print("x: {d}\n", .{x});
    x = foo(@ptrCast(&x), &x);
    std.debug.print("x?: {d}\n", .{x});
}

In all release modes as of 0.13.0 this produces the expected output of:

x: 0
x?: 0

This differs from the C/C++ example, where the UB results in x being set to 1.
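For reference, the @bitCast route mentioned above can be sketched as follows. This is a minimal illustration rather than anything from the original post; the helper names floatBits and floatFromBits are made up for the example. It reinterprets a value's bits directly, so no differently typed pointer is ever created.

```zig
const std = @import("std");

/// Reinterpret the bits of an f32 as a u32 (hypothetical helper name).
fn floatBits(f: f32) u32 {
    return @bitCast(f);
}

/// The inverse: build an f32 from a raw IEEE-754 bit pattern (hypothetical helper name).
fn floatFromBits(bits: u32) f32 {
    return @bitCast(bits);
}

pub fn main() void {
    // 1.0 in IEEE-754 single precision has the bit pattern 0x3f800000.
    std.debug.print("bits of 1.0: 0x{x}\n", .{floatBits(1.0)});
    std.debug.print("0x3f800000 as f32: {d}\n", .{floatFromBits(0x3f800000)});
}
```

Both types must have the same bit size for @bitCast to compile, which is part of what makes it harder to misuse than a pointer cast.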

LucasSantos91 · February 20, 2025, 12:25am

Zig doesn’t do strict aliasing, though, last I heard, they haven’t ruled it out.
Check out this thread.

andrewrk · February 20, 2025, 4:21am

Aliasing in general is legal in Zig; however, some types have well-defined memory layout while others do not. For example, extern struct has well-defined memory layout, but struct does not.

Therefore, if you alias two things which have well-defined memory layout, the code is well-defined.

However, it is not well-defined to load memory through a differently typed alias when the type of memory in question has a non-well-defined memory layout, or the alias has a non-well-defined memory layout.

Where, exactly, to put the “illegal behavior” stamp is not decided yet, but probably it will be at the @ptrCast because that is where it is cheapest to put a safety check, and there is a plan for @ptrCast safety.
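A minimal sketch of the "both sides have well-defined layout" case, assuming an extern struct aliased as a byte array. The names Pair and fillThroughBytes are hypothetical and only serve the illustration.

```zig
const std = @import("std");

// `extern struct` has a well-defined (C ABI) layout: two u16 fields, 4 bytes, no padding.
const Pair = extern struct {
    a: u16,
    b: u16,
};

/// Write every byte of a Pair through a byte-array alias of the same memory.
fn fillThroughBytes(p: *Pair, byte: u8) void {
    // Both *Pair and *[4]u8 point at types with well-defined layout.
    const bytes: *[@sizeOf(Pair)]u8 = @ptrCast(p);
    @memset(bytes, byte);
}

pub fn main() void {
    var p = Pair{ .a = 0, .b = 0 };
    fillThroughBytes(&p, 0xAB);
    // Every byte is 0xAB, so both fields read 0xABAB regardless of endianness.
    std.debug.print("a=0x{x} b=0x{x}\n", .{ p.a, p.b });
}
```

If Pair were a plain struct instead, the field order and padding would be up to the compiler, which is the case described above as not well-defined.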

haydenridd · February 20, 2025, 4:29am

Makes sense, appreciate the explanation! Very excited about the proposed @ptrCast safety checks. I write firmware and so there’s naturally a lot of memory “reinterpretation” that goes on.

andrewrk · February 20, 2025, 5:50am

Interesting, would you build your firmware in ReleaseSafe mode? Or perhaps you can afford Debug mode for development, so these safety checks would help while developing, but then once you’ve finished QA/testing, you would ship ReleaseSmall in production? I’m curious about the specifics.

floooh · February 20, 2025, 7:52am

FWIW there’s an important distinction between C and C++ here: in C, type punning via a union is legal; in C++ it’s not. C++ gained std::bit_cast instead, so memcpy should never be needed in either language to ‘bit cast’ between types (use a union in C and std::bit_cast in C++). The only dilemma is code that’s supposed to be valid as both C and C++.

matklad · February 20, 2025, 9:56am

Am I correct that this case is not just about pointer casting, but about actual loads and stores? That pointer casts by themselves are fine as long as, if you load the thing as Foo, it must have been stored as Foo?

In other words, that the following code which could be found, in, eg, some C APIs, is valid:

const ZigStruct = struct {
    func: *const fn (u32) void,
    int: u64,
};

pub const Alignment = @alignOf(ZigStruct);
pub const Opaque = [@sizeOf(ZigStruct)]u8;

// Hypothetical callback; stands in for whatever the API would register.
fn func(_: u32) void {}

pub export fn init(c: *align(Alignment) Opaque) void {
    // Store a ZigStruct into the caller-provided opaque bytes.
    const z: *ZigStruct = @ptrCast(c);
    z.* = .{ .func = &func, .int = 92 };
}

pub export fn use(c: *align(Alignment) Opaque) void {
    // Load the ZigStruct that init() stored earlier.
    const z: *ZigStruct = @ptrCast(c);
    z.func(@intCast(z.int));
}

fn imagine_this_being_c_code() void {
    var c align(Alignment) = @as(Opaque, undefined);
    init(&c);
    use(&c);
}

Here, we are casting [_]u8 (well-defined layout) to ZigStruct (not well-defined layout), but, because we make sure that, whenever we load a ZigStruct through a u8 pointer, we’ve previously stored a ZigStruct, this should be fine, right? Right?

haydenridd · February 20, 2025, 4:14pm

Honestly unless I was super constrained I’d ideally ship production in ReleaseSafe! That way a custom panic handler can do things like:

- Write a stack trace to some non-volatile memory before gracefully resetting
- Potentially log/upload stack trace to cloud if it’s a connected device

I’d much rather just keep the safety checks in if their cost isn’t affecting performance/code size enough to matter for the given application. On the odd chance an assert or safety check gets tripped even after extensive QA, I’d much rather have my well-defined path kick in than just keep on charging.

haydenridd · February 20, 2025, 4:18pm

Ugh, so using a union in C to type pun is sort of allowed. It’s maybe the single most confusing piece in the C standard in my opinion lol. This comment on another thread on Ziggit is pretty good at explaining:

Basically in practice, it always works to type pun through a union. But it’s sort of murky whether or not it’s completely kosher via the standard. You are completely correct though that as of C++20 std::bit_cast is the “canonical” way to type pun in C++ and should be used over memcpy.

andrewrk · February 20, 2025, 9:30pm

You basically are demonstrating how memory allocation works, which, yeah, is clearly not going to regress based on lang spec rules.

I struggled with the wording of this:

Hopefully whoever words it in the spec can do a better job than me.

But anyway if you look at the plans for pointer casting safety, your example demonstrates it perfectly:

- The cast in init will have to change to be @ptrCastUndef - the one that sets the bytes to undefined (and sets the secret safety type tag).
- The cast in use is the one that checks that the secret safety type tag is correct.

floooh · February 20, 2025, 10:51pm

The union type punning behaviour has been clarified in the C99 standard, e.g. see here under Explanation:

https://en.cppreference.com/w/c/language/union

If the member used to access the contents of a union is not the same as the member last used to store a value, the object representation of the value that was stored is reinterpreted as an object representation of the new type (this is known as type punning). If the size of the new type is larger than the size of the last-written type, the contents of the excess bytes are unspecified (and may be a trap representation). Before C99 TC3 (DR 283) this behavior was undefined, but commonly implemented this way.

(i.e. union type punning is not ‘sort of’ but ‘explicitly’ allowed in C since C99 - it was only murky in C89)

FWIW I think that being able to have different views on the same memory is extremely valuable, even if it means that load/stores cannot be ‘optimized away’ by the compiler.
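For comparison, a Zig-side analogue of the C union idiom might look like the sketch below. It assumes that reading an inactive field of an extern union reinterprets the stored bytes, as it does in C. The names FloatView and bitsViaUnion are made up for illustration, and nothing in this thread confirms how the Zig spec will eventually word this case.

```zig
const std = @import("std");

// Assumption: an `extern union` has C-compatible layout and carries no hidden
// safety tag, so both fields are simply two views of the same 4 bytes.
const FloatView = extern union {
    f: f32,
    bits: u32,
};

/// Store an f32, then read the same memory back as a u32 (hypothetical helper).
fn bitsViaUnion(f: f32) u32 {
    const v = FloatView{ .f = f };
    return v.bits;
}

pub fn main() void {
    // 1.0 in IEEE-754 single precision has the bit pattern 0x3f800000.
    std.debug.print("0x{x}\n", .{bitsViaUnion(1.0)});
}
```

A plain (non-extern) union is a different story: in safe build modes Zig tracks the active field and treats access to an inactive one as illegal behavior.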

haydenridd · February 21, 2025, 6:12pm

Ahhh okay, got it, appreciate the clarification!

Validark · February 22, 2025, 3:29pm

Personally, I see “ReleaseSafe” as “FastDebug”. I usually don’t see the point in the compiler automatically inserting checks that will cause a panic if you get an overflow or end up with an invalid member of an enum. Instead of writing cowboy code and expecting the compiler to catch issues, why don’t YOU write the checks in your code? You can check for overflow, you can check for invalid enum casts, you can check if some suspiciously generated index is larger than the buffer it looks inside.

Grep for + and start handling overflows. And you can handle those errors in the way that makes the most sense for each one.

I suppose it is a nice feature of the language that these can be generated for you and you can just use a custom panic handler. Still, I am suspicious of the idea that programming-language-as-a-safety-framework is the best solution to this problem. I’d much prefer very conservative programming practices where you can actually see the branches and error handling rather than an implicit global panic handler that might be set off at any time by code that broke its contractual obligations. I feel like ReleaseSafe is throwing the baby out but keeping the bathwater. I’d ship ReleaseSmall if I were you.
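To make the "write the checks yourself" approach concrete, here is a rough sketch of explicit overflow and bounds handling. The helper names checkedAdd and getOrNull are hypothetical, and the saturating fallback is just one possible local policy, not a recommendation from the post.

```zig
const std = @import("std");

/// Add two u8 values, reporting overflow as an error instead of panicking.
fn checkedAdd(a: u8, b: u8) error{Overflow}!u8 {
    return std.math.add(u8, a, b);
}

/// Look up `index` in `buf`, returning null instead of tripping a bounds check.
fn getOrNull(buf: []const u8, index: usize) ?u8 {
    if (index >= buf.len) return null;
    return buf[index];
}

pub fn main() void {
    // The overflow is handled right where it can happen.
    const sum = checkedAdd(200, 100) catch |err| blk: {
        std.debug.print("add failed: {s}\n", .{@errorName(err)});
        break :blk 255; // saturate as one possible local policy
    };
    std.debug.print("sum = {d}\n", .{sum});

    const buf = [_]u8{ 1, 2, 3 };
    if (getOrNull(&buf, 7)) |value| {
        std.debug.print("value = {d}\n", .{value});
    } else {
        std.debug.print("index out of range\n", .{});
    }
}
```

Each failure path is visible at the call site, which is the trade-off being argued for here against a single global panic handler.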

loremayer · February 22, 2025, 9:32pm

I disagree with this. Crashing is still undesirable, so I still write proper checks in ReleaseSafe mode. However, I do make mistakes sometimes, everyone does. ReleaseSafe turns a catastrophic vulnerability caused by a programming mistake into a safe crash. Of course, it doesn’t catch all memory safety issues, but it catches a lot. Also, it makes a negligible difference in performance.