Prevent Zig from reordering instructions

Is there any way currently to prevent the Zig compiler from “optimizing” by reordering function calls?

const res = funcWithSideEffects();
dealWithSideEffects();

return res;

In this snippet I'd expect the first function to always be executed first. That is what happens in Debug mode, but on certain architectures it does not happen in release modes.

Note: the specific context is a call to an FPU instruction, rint, which can set an exception flag on the FPU that Zig does not know about, so it treats the rint function as pure.

You can use a Condition to signal and wait for specific conditions to become true. Conditions are used with a Mutex for waiting.

var m: std.Io.Mutex = .init;
var c: std.Io.Condition = .init;

// mutex must be locked for the duration of both operations
m.lockUncancelable(io);
defer m.unlock(io);

const res = funcWithSideEffects(); // calls c.signal(io) after initialization
dealWithSideEffects(); // calls c.waitUncancelable(io, &m) before use

Normally one uses atomics and memory barriers to deal with instruction reordering. But this is not about accessing memory, it is about hardware initialization.

EDIT: this cannot fix the problem of the compiler rearranging the statements; it only solves the synchronization problem when the functions are called in different threads.

One way to force the compiler to keep the instructions in order is to use std.mem.doNotOptimizeAway, but you must also be sure that the CPU does not reorder them.
See: How to use std.mem.doNotOptimizeAway?

I think doNotOptimizeAway is meant to prevent functions from being removed by the compiler. It wouldn't prevent a function from being reordered so that it executes later than it should.

Yes, the trick is to return something from the first function and use it in the second, effectively forcing the ordering.
Unfortunately the compiler is smart and can optimize the value away, unless you tell it not to.

I don’t know the answer to your question but maybe you can set an atomic global variable (called rint_called or something) and check it (and branch on it) in the second function?
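
A minimal sketch of what that suggestion could look like. The flag name rint_called comes from the post above; the function bodies are placeholders (@round stands in for the real rint call). Nothing in this thread establishes that an atomic flag stops the optimizer from moving a call it believes to be pure, so treat this as an illustration of the idea, not a confirmed fix.

```zig
const std = @import("std");

// Hypothetical global flag, named after the suggestion above.
var rint_called = std.atomic.Value(bool).init(false);

// Placeholder for the real call: rounds, then records that it ran.
fn funcWithSideEffects(x: f64) f64 {
    const rounded = @round(x);
    rint_called.store(true, .seq_cst);
    return rounded;
}

// Returns true only if the flag shows that the first function already ran,
// and resets the flag for the next call.
fn dealWithSideEffects() bool {
    if (!rint_called.load(.seq_cst)) return false;
    rint_called.store(false, .seq_cst);
    return true;
}
```

The second function branches on the flag, so it can at least detect that the first call has not happened yet instead of silently clearing the wrong state.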

(Sorry about the above, accidentally hit send on an incomplete message.)

I need a bit more context from your actual code to answer this question properly. I presume you're calling this FPU instruction in an asm block, and then reading the status word in another asm block? In that case, I think you need to make sure that both blocks are marked as volatile, because they affect global state. If an asm block is not marked as volatile, the compiler can assume that it is a pure function of its inputs, with no side effects other than the outputs and clobbers you specify. This means it can reorder it, duplicate it, or delete it if unused.

The previous answers don’t seem right to me. I have no idea what Mutex or Condition have to do with any of this (they synchronize between async/concurrent tasks, which is completely different from specifying data dependencies to the optimizer!) and std.mem.doNotOptimizeAway is almost never a good suggestion (it has very vague semantics and is usually a hack).

I didn’t want to influence the responses to see if there was a more general solution that would fit other problems but I should have given the right context straight away.

I’ve been trying to port a libc function called nearbyint which is just rint but without raising / storing FPU exceptions in case of errors. Here’s the PR https://codeberg.org/ziglang/zig/pulls/35302

What prompted the question was https://codeberg.org/ziglang/zig/pulls/35302#issuecomment-16201595.

I have an idea on how to force the compiler to execute these two specific C float functions sequentially: apply a mask to the result of rint and use that mask in feclearexcept (been AFK so haven't actually tested anything yet).

I wanted to see if there were other approaches I was unaware of that would feel less hacky and more intentional.

Does the volatile keyword also work on zig functions and C functions?

The current rint function exists in Zig's math library; I think it gets lowered to a single instruction on some targets. The other two functions are libc functions that are still in C.

This should work:

const res = funcWithSideEffects();
asm volatile ("");
dealWithSideEffects();

return res;

This should not work; there is nothing that prevents the compiler from rearranging the calls.

doNotOptimizeAway introduces a volatile asm block that references either a register or a memory location depending on the parameter.

dealWithSideEffects can receive the res as parameter and call doNotOptimizeAway(res); to force a dependency of the second function to the result created by the first function.
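
Roughly, that would look like the sketch below. The function names are the ones from the original question; their bodies and the hidden_state variable are made-up stand-ins for the real side effect. This only creates an opaque use of res before the second function does its work. It is not a documented ordering guarantee, so check the generated code for your target.

```zig
const std = @import("std");

// Hypothetical stand-in for state the optimizer does not model (for example an FPU flag).
var hidden_state: u32 = 0;

// Placeholder: produces a value and changes the hidden state as a side effect.
fn funcWithSideEffects() u32 {
    hidden_state += 1;
    return 42;
}

// Takes the first function's result so that it has a data dependency on it.
fn dealWithSideEffects(token: u32) u32 {
    // Opaque use of the token: the value must exist before this point.
    std.mem.doNotOptimizeAway(token);
    const seen = hidden_state;
    hidden_state = 0;
    return seen;
}

pub fn run() u32 {
    const res = funcWithSideEffects();
    _ = dealWithSideEffects(res);
    return res;
}
```

If the signature of the second function cannot change, the same doNotOptimizeAway(res) call can sit at the call site, right before the second function is called.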

For academic purposes, here are my findings. This is the original, correct function, which gets reordered.

fn nearbyintGeneric(comptime T: type, rint_func: fn (T) callconv(.c) T, x: T) T {
    const e = std.c.fetestexcept(std.c.FE_INEXACT);
    const result = @trunc(rint_func(x));
    if (e == 0) {
        _ = std.c.feclearexcept(std.c.FE_INEXACT);
    }
    return result;
}

With -OReleaseSmall, the reordering puts the rint call after the if block. I tried asm volatile ("") but the instructions were still reordered.

~/repo/zig$ git diff lib/c/math.zig
diff --git a/lib/c/math.zig b/lib/c/math.zig
index efbe31c1c6..7d304303fa 100644
--- a/lib/c/math.zig
+++ b/lib/c/math.zig
@@ -399,6 +399,7 @@ fn rintl(x: c_longdouble) callconv(.c) c_longdouble {
 fn nearbyintGeneric(comptime T: type, rint_func: fn (T) callconv(.c) T, x: T) T {
     const e = std.c.fetestexcept(std.c.FE_INEXACT);
     const result = @trunc(rint_func(x));
+    asm volatile ("");
     if (e == 0) {
         _ = std.c.feclearexcept(std.c.FE_INEXACT);
     }

I also tried my original approach, which was to reuse the result value in the feclearexcept call, but the function is still reordered.

~/repo/zig$ git diff lib/c/math.zig
diff --git a/lib/c/math.zig b/lib/c/math.zig
index efbe31c1c6..508fa8a519 100644
--- a/lib/c/math.zig
+++ b/lib/c/math.zig
@@ -399,8 +399,10 @@ fn rintl(x: c_longdouble) callconv(.c) c_longdouble {
 fn nearbyintGeneric(comptime T: type, rint_func: fn (T) callconv(.c) T, x: T) T {
     const e = std.c.fetestexcept(std.c.FE_INEXACT);
     const result = @trunc(rint_func(x));
+    const cast: f32 = @floatCast(result);
+    const masked: c_int = @bitCast(cast);
     if (e == 0) {
-        _ = std.c.feclearexcept(std.c.FE_INEXACT);
+        _ = std.c.feclearexcept(std.c.FE_INEXACT | (masked & std.c.FE_INEXACT));
     }
     return result;
 }

With some print statements, this is the output:

~/repo/zig$ git diff lib/c/math.zig
diff --git a/lib/c/math.zig b/lib/c/math.zig
index efbe31c1c6..b928a50f4e 100644
--- a/lib/c/math.zig
+++ b/lib/c/math.zig
@@ -399,9 +399,14 @@ fn rintl(x: c_longdouble) callconv(.c) c_longdouble {
     const result = @trunc(rint_func(x));
     const cast: f32 = @floatCast(result);
     const masked: c_int = @bitCast(cast);
+    std.debug.print("Should be '{}', is {}\n", .{ std.c.FE_INEXACT, std.c.fetestexcept(std.c.FE_INEXACT) });
     if (e == 0) {
+        std.debug.print("Inside If: Should be '{}', is {}\n", .{ std.c.FE_INEXACT, std.c.fetestexcept(std.c.FE_INEXACT) });
         _ = std.c.feclearexcept(std.c.FE_INEXACT | (masked & std.c.FE_INEXACT));
     }
+    std.debug.print("AFTER IF: Should be '{}', is {}\n", .{ 0, std.c.fetestexcept(std.c.FE_INEXACT) });
     return result;
 }

-------

~/repo/zig$ qemu-riscv64 ./ntest-release-small
Should be '1', is 0
Inside If: Should be '1', is 0
AFTER IF: Should be '0', is 1

And of course in Debug mode, where the instructions are not reordered, the program's behaviour is correct:

~/repo/zig$ qemu-riscv64 ./ntest-debug
Should be '1', is 1
Inside If: Should be '1', is 1
AFTER IF: Should be '0', is 0

I can also confirm that using std.mem.doNotOptimizeAway does prevent the instructions from being reordered. I can't change the signature of feclearexcept, since it has to match the libc function exactly, but placing the doNotOptimizeAway call inside the if block prevents the reordering.

~/repo/zig$ git diff lib/c/math.zig
diff --git a/lib/c/math.zig b/lib/c/math.zig
index efbe31c1c6..9078c326ae 100644
--- a/lib/c/math.zig
+++ b/lib/c/math.zig
@@ -400,6 +400,7 @@ fn nearbyintGeneric(comptime T: type, rint_func: fn (T) callconv(.c) T, x: T) T
     const e = std.c.fetestexcept(std.c.FE_INEXACT);
     const result = @trunc(rint_func(x));
     if (e == 0) {
+        _ = std.mem.doNotOptimizeAway(result);
         _ = std.c.feclearexcept(std.c.FE_INEXACT);
     }
     return result;

------

~/repo/zig$ qemu-riscv64 ./ntest-release-small
Should be '1', is 1
Inside If: Should be '1', is 1
AFTER IF: Should be '1', is 0