I noticed while messing around with a custom _start function and implementing syscall wrappers on macOS (aarch64-macos is the target) that -O ReleaseSmall causes some odd optimizations that seem incorrect.
Here’s a minimal example:
pub export fn main() u8 {
    _ = myWrite(.out, "hi", 2);
    return 0;
}

pub const Fd = extern struct {
    num: i32,

    pub const out: Fd = .{ .num = 1 };
};

const write_sys = 0x200_0004;

pub export fn myWrite(fd: Fd, ptr: [*]const u8, n: usize) usize {
    return syscall3(@intCast(fd.num), @intFromPtr(ptr), n, write_sys);
}

pub inline fn syscall3(arg0: usize, arg1: usize, arg2: usize, sys: u64) usize {
    return asm volatile ("svc #0x80"
        : [ret] "={x0}" (-> usize),
        : [sys] "{x16}" (sys),
          [arg0] "{x0}" (arg0),
          [arg1] "{x1}" (arg1),
          [arg2] "{x2}" (arg2),
        : .{});
}
This defines an inline syscall3 function that issues a syscall with three arguments, and uses it to wrap the write syscall as myWrite.
Reading the generated assembly on Compiler Explorer without optimizations, you should see this snippet in the main function:
    adrp x8, ___anon_1943@PAGE
    add x8, x8, ___anon_1943@PAGEOFF
    ldr x0, [x8]
    adrp x1, ___anon_1947@PAGE
    add x1, x1, ___anon_1947@PAGEOFF
    mov w8, #2
    mov x2, x8
    bl _example.myWrite
and this one, outside main, defines those ___anon values:
___anon_1943:
    .long 1
___anon_1947:
    .asciz "hi"
The first snippet correctly loads Fd.out into the first argument register (x0), then loads the string pointer into x1 and the length into x2, and calls myWrite.
Now, if we look at the generated assembly with ReleaseSmall optimizations on Compiler Explorer, we see this as the contents of the main function:
    adrp x1, ___anon_1947@PAGE
    add x1, x1, ___anon_1947@PAGEOFF
    mov w16, #4
    movk w16, #512, lsl #16
    mov x0, #0
    mov w2, #2
    svc #0x80
    mov w0, #0
    ret
This is syscall3 and myWrite inlined with the passed arguments. It puts the string pointer into the second argument (x1), the syscall number into w16, and the number of characters into the third argument (w2), as it should.
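As a quick sanity check that the mov/movk pair above really does rebuild write_sys, here is a minimal sketch. movThenMovk is a hypothetical helper written only for this illustration; it is not part of the original program.

```zig
const std = @import("std");

/// Hypothetical helper (not from the original code): the value left in a
/// 32-bit register by `mov wN, #low` followed by `movk wN, #high, lsl #16`.
/// movk replaces bits 16..31 and keeps the low 16 bits untouched.
fn movThenMovk(low: u16, high: u16) u32 {
    return (@as(u32, high) << 16) | @as(u32, low);
}

test "mov/movk pair rebuilds write_sys" {
    const write_sys: u32 = 0x200_0004;
    // mov w16, #4 ; movk w16, #512, lsl #16
    try std.testing.expectEqual(write_sys, movThenMovk(4, 512));
}
```

Running this with zig test should pass, so the syscall number in w16 matches the source; only x0 differs from what the code asks for.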
However, the first argument, x0, is set to #0, which tells the write syscall to write the string to file descriptor 0 (stdin) instead of 1 (stdout). On my macOS machine, running this code without optimizations correctly writes to stdout and the text shows up in the terminal. Running it with optimizations also shows the text in the terminal, but the string is written to stdin, which is neither the same behavior nor what I intended.
I checked a few other versions on Compiler Explorer, and the equivalent code seems to have the same apparent miscompilation at least as far back as Zig 0.10.0. In that version it emits mov x0, xzr rather than mov x0, #0, but both simply zero the register.
Is this use case not supported, or is this a compiler bug, or something else?