Duplicate functions under ReleaseFast

I noticed something odd while checking the output of the following code at godbolt:

pub fn copy1(s1: []const u32, s2: []u32) void {
    for (s1, 0..) |x, k| {
        s2[k] = x;
    }
}

pub fn copy2(s1: []const u32, s2: []u32) void {
    for (s1, s2) |x, *p| {
        p.* = x;
    }
}

export const exports = [_]*const anyopaque{
    @ptrCast(&copy1),
    @ptrCast(&copy2),
};

At -O ReleaseSmall, I got the following:

example.copy1:
        xor     eax, eax
.LBB0_1:
        cmp     rsi, rax
        je      .LBB0_3
        mov     ecx, dword ptr [rdi + 4*rax]
        mov     dword ptr [rdx + 4*rax], ecx
        inc     rax
        jmp     .LBB0_1
.LBB0_3:
        ret

exports:
        .quad   example.copy1
        .quad   example.copy1

That’s the expected result, since copy1() and copy2() should compile to exactly the same code, so only one copy is kept and both entries in exports point to it. When optimization is set to ReleaseFast, however, both copy1 and copy2 are emitted and both appear in exports. A diff of the asm listings showed that, aside from the labels used, the two functions are identical.
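As a sanity check that the two loops really do the same thing at the source level, something like the sketch below could be run with zig test. The test name and the array contents are made up for illustration, and it only covers the case where both slices have the same length.

```zig
const std = @import("std");

// Index-based copy, as in the original post.
fn copy1(s1: []const u32, s2: []u32) void {
    for (s1, 0..) |x, k| {
        s2[k] = x;
    }
}

// Pointer-capture copy, as in the original post.
fn copy2(s1: []const u32, s2: []u32) void {
    for (s1, s2) |x, *p| {
        p.* = x;
    }
}

// Hypothetical test: both functions should fill the destination identically.
test "copy1 and copy2 produce the same output" {
    const src = [_]u32{ 1, 2, 3, 4 };
    var a = [_]u32{0} ** 4;
    var b = [_]u32{0} ** 4;

    copy1(&src, &a);
    copy2(&src, &b);

    try std.testing.expectEqualSlices(u32, &a, &b);
    try std.testing.expectEqualSlices(u32, &src, &a);
}
```

This only shows the two functions behave the same for equal-length slices; it says nothing about why the optimizer keeps both copies under ReleaseFast.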

I’m wondering whether this is due to a bug somewhere.

I’ve seen this a few times, where two copies of the exact same function are emitted. I haven’t checked, but are both functions aligned the same?

I’ve also seen copy2 compiled down to a single jump to copy1 when they didn’t inline.

(Also, ReleaseSmall sometimes generates better, faster code when something like an inner loop is poorly aligned under ReleaseFast and unrolling compounds the problem on typical machines. Newer Intel CPUs are supposed to be much less sensitive to this issue.)
