Why does the compiler generate a `seto` instruction but not use it?

I saw this while looking into something else, and have been trying to figure it out. I’ll be the first to admit that I don’t understand every trick and incantation of assembly. Indeed, my assembly skills are quite poor.

Given the following Zig program:

export fn square(num: i32) i32 {
    return num * num;
}

You will get the following assembly output:

square:
        push    rbp
        mov     rbp, rsp
        sub     rsp, 288
        mov     dword ptr [rbp - 284], edi
        lea     rax, [rbp - 280]
        mov     qword ptr [rbp - 16], rax
        mov     qword ptr [rbp - 8], 32
        mov     qword ptr [rbp - 24], 0
        imul    edi, edi
        mov     dword ptr [rbp - 288], edi
        seto    al
        jo      .LBB0_1
        jmp     .LBB0_2
.LBB0_1:
        lea     rdi, [rbp - 24]
        call    "debug.FullPanic((function 'panic')).integerOverflow"
.LBB0_2:
        mov     eax, dword ptr [rbp - 288]
        add     rsp, 288
        pop     rbp
        ret

(Note that this is the default function Godbolt shows when you open a Zig file.)

My question is: why does the compiler emit a seto al instruction? I know that seto reads the overflow flag in the EFLAGS register, but al is never used. jo jumps based on that same flag, and after that eax is overwritten with the result of the multiplication.

Note that the instruction goes away in release modes.
I think it is for handling @mulWithOverflow.


More mysterious are these setos, which appear when the optimize mode is ReleaseSmall:

const std = @import("std");

export fn overflow(num: i32) u8 {
    _, const b = @mulWithOverflow(num, num);
    return b;
}

overflow:
        push    rbp
        mov     rbp, rsp
        imul    edi, edi
        seto    al
        seto    byte ptr [rbp - 4]
        seto    byte ptr [rbp - 5]
        pop     rbp
        ret

This is @mulWithOverflow. Since it’s an intrinsic, it is automatically inlined, even in debug builds. It returns two values: the product is returned on the stack and the overflow bit is returned in al. The compiler saw that the overflow flag was still set after the imul and based the jump (jo) on that directly. But since debug builds don’t do optimizations, it didn’t realize that seto al was unnecessary.
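
If I’m reading the generated code right, a safety-checked `num * num` behaves roughly like the hand-written version below. Treat it as a sketch only: `squareChecked` is a name I made up, the `@panic` message is a stand-in for the real integerOverflow panic handler, and the tuple indexing assumes a Zig version where `@mulWithOverflow` returns a tuple (0.11 or later).

```zig
// Hypothetical hand-written equivalent of a safety-checked `num * num`.
export fn squareChecked(num: i32) i32 {
    // result[0] is the wrapped product, result[1] is the u1 overflow bit.
    const result = @mulWithOverflow(num, num);

    // The overflow bit is the value that seto would materialize in a register.
    // Branching on it corresponds to the jo in the listing above.
    if (result[1] != 0) @panic("integer overflow");

    return result[0];
}
```

Seen this way, the unused seto al is just the overflow bit being written out as a value, even though the branch itself only needs the flag.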

Same thing here. The return value from @mulWithOverflow is set by seto al. But the compiler also needed to set the value of _. Just because you’re discarding it doesn’t mean that the compiler can discard it, as that would require analysis. The compiler could have copied al into _, but instead it issued seto again. I don’t know why it did it twice, though.


But what’s being discarded is the result of the multiplication. The overflow flag is being returned. And it’s in the right register already.


OK, it being an artifact of inlining makes sense. I tried to see the output in the other release modes in Godbolt, but I guess I was doing it wrong. Testing locally, I see that ReleaseSafe doesn’t emit the seto.