Key semantics of std.debug.assert

No, and is defined as short-circuiting (see the operator table in the langref), and it can do that because it’s a keyword.

If you saw something like std.logic.and(x, y), you would know, trivially and with full confidence, that and() could not guarantee short-circuiting.
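A minimal sketch of that difference, not taken from the thread. The names noisy and logicAnd are made up for illustration (there is no std.logic.and); logicAnd simply stands in for any ordinary function, whose arguments are all evaluated before the call happens.

```zig
const std = @import("std");

// Counts how many times noisy() has run.
var calls: u32 = 0;

/// Hypothetical operand with a visible side effect.
fn noisy(value: bool) bool {
    calls += 1;
    return value;
}

/// Hypothetical stand-in for a non-keyword `and`: a plain function.
fn logicAnd(a: bool, b: bool) bool {
    return a and b;
}

/// The keyword short-circuits: the right operand is never evaluated.
fn callsWithKeyword() u32 {
    calls = 0;
    _ = noisy(false) and noisy(true);
    return calls;
}

/// A function call evaluates both arguments before the call.
fn callsWithFunction() u32 {
    calls = 0;
    _ = logicAnd(noisy(false), noisy(true));
    return calls;
}

pub fn main() void {
    std.debug.print("keyword and: {d} call(s)\n", .{callsWithKeyword()});
    std.debug.print("logicAnd():  {d} call(s)\n", .{callsWithFunction()});
}
```

The keyword version runs noisy once, the function version runs it twice, regardless of build mode.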


Understanding evaluation order of function calls is not about language internals, it’s the mental model that users of the language must build in order to know what they’re doing.

You mean an assert macro or an assert builtin, because functions can’t normally do these things in other languages either, not just Zig.


This part is fine. But as Andrew said, expensive assert args can still be deleted, presumably through dead code elimination

Whether or not it happens is at least partly about side-effects, for which there exists a very concrete list that Andrew shared above.

But is the lack of side-effects supposed to guarantee that the (potentially expensive) expression will be eliminated, or just that it may be eliminated?

What’s your mental model for deciding when it’s fine to assert and be sure that it won’t affect performance negatively?

I’m not sure I see where the side-effect is in the example below, which causes the expensive assert to exist in ReleaseFast:

const std = @import("std");

// Simple bounded len to avoid depending on std.mem
fn len(val: [*:0]const u8) usize {
    for (val, 0..1000) |v, i| {
        if (v == 0) return i;
    }
    return 0;
}

fn checkSum(val: [*:0]const u8) bool {
    var count: usize = 0;
    var sum: usize = val[1];
    // This should take a while
    while (count < len(val) * 100_000_000) {
        count += 1;
        sum += 1;
    }

    return sum == 1300000047;
}

pub fn main() void {
    var args = std.process.args();
    // Just to get a value that is only known at runtime
    const first = args.next().?;

    std.debug.assert(checkSum(first));
}

zig-v141 build-exe -OReleaseFast sideeffects.zig
./sideeffects

This takes several seconds under ReleaseFast

Is it due to inlining/optimizations, so that side-effects from std.process seep into checkSum? Probably not, since @call(.never_inline,... yields the same result.

Or is it simply that there’s no guarantee (dead) code is deleted under ReleaseFast, even in the absence of side effects?


While it makes sense, my mental model had been that std.debug.assert would only remove “simple” expressions (boolean comparisons), so it’s nice to know that it can remove more complex ones if there are no side-effects. I agree with @cryptocode that it would be nice to get a good idea of when that is guaranteed vs just possible.

You mean an assert macro or an assert builtin, because functions can’t normally do these things in other languages either, not just Zig.

Many languages make no visible distinction between builtins/macros and regular function calls. Take Python for example (which has already come up in this discussion). The only way to know that you are using a builtin in Python is to memorize the list of builtin functions. So it isn’t surprising that the common mechanics of assert in other languages are not associated with the fact that assert is usually a macro or builtin.

EDIT:

I forgot that assert in python is a keyword, not a function call, so probably not the best example to use.


Some might make it harder, but this example is factually wrong. assert is a statement in Python, that’s why you don’t have to put parens after it!

assert 2+2 == 4, "bad math"

EDIT

yep :^)


That’s a fair question and your example does suggest that the language is not providing you with elision guarantees, at least at the moment.

One thing I do when I want to make sure that an expensive check never escapes debug builds is this (note that this means that I remove the check in ReleaseSafe):

In this example there’s a task progression value that is there just to catch programming errors which is not needed for correctness, so I just take everything out when not in a debug build.

Here I’m sorting some scanned files in order to make debug builds more deterministic, but this sorting is not needed for correctness, so I don’t do it in releases:

This is to say that there are without a doubt some safety checks that the compiler cannot reasonably elide in optimized builds so you will need to use this approach from time to time.

Going back to your example, that one seems like something the optimizer should be able to figure out, and I don’t know why that doesn’t happen. Regardless, I guess it’s fair to assume that there’s a gray area where you might want to code defensively.

Note that you don’t have to branch on the build mode necessarily, there’s also std.debug.runtime_safety (which separates Debug and ReleaseSafe from the rest).
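A minimal sketch of the two guards mentioned in this post, not the poster's actual code. isSorted and process are hypothetical names standing in for an expensive check and the code that uses it. Because both conditions are comptime-known, the guarded call is not compiled into builds where the condition is false, so nothing is left for the optimizer to elide.

```zig
const std = @import("std");
const builtin = @import("builtin");

/// Hypothetical expensive check: O(n) scan that verifies ascending order.
fn isSorted(items: []const u32) bool {
    if (items.len < 2) return true;
    for (items[0 .. items.len - 1], items[1..]) |a, b| {
        if (a > b) return false;
    }
    return true;
}

/// Hypothetical caller that wants the check in some build modes only.
fn process(items: []const u32) u32 {
    // Option 1: only pay for the check in Debug builds
    // (this also drops it from ReleaseSafe).
    if (builtin.mode == .Debug) {
        std.debug.assert(isSorted(items));
    }

    // Option 2: keep the check in Debug and ReleaseSafe, drop it in
    // ReleaseFast and ReleaseSmall.
    if (std.debug.runtime_safety) {
        std.debug.assert(isSorted(items));
    }

    var sum: u32 = 0;
    for (items) |item| sum += item;
    return sum;
}

pub fn main() void {
    const data = [_]u32{ 1, 2, 3, 5, 8 };
    std.debug.print("sum = {d}\n", .{process(&data)});
}
```

In real code you would pick one of the two options; they are shown together here only for comparison.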


I’ll put std.debug.runtime_safety to good use when necessary. Appreciate the detailed response.


Tbf, in C it’s also a bit weird, e.g. just compiling like this:

clang bla.c -O3 -o bla

…keeps the assert() active, you’ll have to do this to remove the assert:

clang bla.c -O3 -DNDEBUG -o bla

AFAIK most C/C++ build systems commonly define NDEBUG in ‘release build modes’ and that’s where the impression is coming from that assert() is an empty macro in ‘release mode’.

…and for the record, I’m totally fine with Zig’s assert() behaviour, but yeah, it probably should be mentioned in the docs, or ideally some ‘migrating from C/C++’ doc section :)


One thing to note about preprocessor asserts:
The expression inside the assert is gone completely, and with it the information inside of the assert.
In the worst case this can actually make your code slower, since the compiler has less knowledge about your code and thus less optimization opportunities.
Here is a simple (constructed) example of this: Compiler Explorer


The expression inside the assert is gone completely, and with it the information inside of the assert.

True, that’s also why C asserts usually translate to static analyzer hints when building the code in analyzer mode (e.g. an assert(ptr != 0) tells the static analyzer that ptr can’t be null, which silences a ton of false positive warnings). And if a C assert translated to __builtin_assume in release mode, it would also serve as an optimizer hint.

Which brings me back to the idea that I’d probably like different assert flavours… one for heavy validation checks (think 3D API validations, which will totally destroy performance when enabled), and ‘light-weight’ asserts which remain in the code in release mode and also serve as optimizer hints.
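One way such flavours could look in Zig, as a rough sketch under stated assumptions rather than anything that exists in std. assume, heavyAssert, allBelow and firstDoubled are all hypothetical names. The heavy flavour takes the check as a function plus an argument tuple, so the check is only ever called in Debug builds instead of relying on the optimizer to remove it.

```zig
const std = @import("std");
const builtin = @import("builtin");

/// Hypothetical light-weight assert: present in every build mode. With
/// safety off, the unreachable branch is an assumption the optimizer may use.
fn assume(ok: bool) void {
    if (!ok) unreachable;
}

/// Hypothetical heavy assert: the check is passed as a function and an
/// argument tuple, and is only called when building in Debug mode.
fn heavyAssert(comptime check: anytype, args: anytype) void {
    if (builtin.mode == .Debug) {
        if (!@call(.auto, check, args)) unreachable;
    }
}

/// Hypothetical expensive validation: every item must be below the limit.
fn allBelow(items: []const u8, limit: u8) bool {
    for (items) |item| {
        if (item >= limit) return false;
    }
    return true;
}

fn firstDoubled(items: []const u8) u16 {
    assume(items.len > 0); // cheap, kept everywhere
    heavyAssert(allBelow, .{ items, @as(u8, 100) }); // Debug only
    return @as(u16, items[0]) * 2;
}

pub fn main() void {
    const samples = [_]u8{ 7, 3, 42 };
    std.debug.print("{d}\n", .{firstDoubled(&samples)});
}
```

The trade-off is a clunkier call site for the heavy flavour, in exchange for not depending on dead code elimination.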

ho ho ho ho what is going on here?
isn’t assert in Zig just always thrown away in ReleaseFast mode??

I cannot believe my eyes… so we can optimize code with assertions???

unsigned int zigAssert(unsigned int x) {
    assert2(x < 16);
    return (x*15)/15;
}

unsigned int zigAssert2(unsigned int x) {
    return (x*15)/15;
}


zigAssert:
        mov     eax, edi
        ret
zigAssert2:
        mov     eax, edi
        mov     edx, 2290649225
        sal     eax, 4
        sub     eax, edi
        imul    rax, rdx
        shr     rax, 35
        ret
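A rough Zig counterpart of the C snippet above, as a sketch only. withAssert and withoutAssert are made-up names. The wrapping multiply *% is used on the assumption that it mirrors C's wrapping unsigned arithmetic; with plain * Zig already treats overflow as illegal behavior, so the comparison would show little. No generated assembly is included because it depends on the compiler version and target.

```zig
const std = @import("std");

/// With safety off, the failed-assert path is unreachable, so the optimizer
/// is allowed to assume x < 16. Then x *% 15 cannot wrap and the whole
/// expression may be reduced to x.
fn withAssert(x: u32) u32 {
    std.debug.assert(x < 16);
    return (x *% 15) / 15;
}

/// Without the assert the multiplication may wrap, so the multiply and
/// divide have to be computed in general.
fn withoutAssert(x: u32) u32 {
    return (x *% 15) / 15;
}

pub fn main() void {
    std.debug.print("{d} {d}\n", .{ withAssert(7), withoutAssert(7) });
}
```

Both functions return the same value for any x below 16; only the information available to the optimizer differs.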

Even though I’ve been on board with the semantics of std.debug.assert since the beginning, I must confess that the example provided by @cryptocode kind of breaks my mental model… I use assert a lot and I’d love to understand why checkSum is still computed in this case. Just out of curiosity, does anyone have an explanation, or even just a hypothesis?

What seems really odd to me is this:

// main.zig

const std = @import("std");

fn assert(ok: bool) void {
    if (!ok) {
        unreachable;
    }
}

fn check(val: []u8) bool {
    var sum: usize = 0;
    for (0..val.len * 100_000_000_000) |v| {
        sum += val[v % val.len];
    }

    return sum == 1_234_567_890;
}

pub fn main() void {
    var prng: std.Random.DefaultPrng = .init(12);
    const rand = prng.random();

    var buf: [100]u8 = undefined;
    rand.bytes(&buf);

    assert(check(&buf));
}

zig run -OReleaseFast main.zig

In this case I observe the same behavior, check is computed.

But if instead I just call _ = check(&buf), then check is elided. I am tempted to deduce that check is recognized as having no side effects, and I see no reason why the optimizer would miss the fact that assert also has no side effect. Unless something unclear to me hides behind unreachable.

I also tried just guarding unreachable in assert with:

fn assert(ok: bool) void {
    if (!ok) {
        if (std.debug.runtime_safety)
            unreachable;
    }
}

With the same result as _ = check(&buf): check is elided in ReleaseFast mode.

I like the fact that unreachable serves as a potential hint for better optimizations, but now I need to be sure it doesn’t prevent optimizations from happening!


I suspect it’s a bug in Zig.

I agree that this particular case (the check thing) should be optimized away, but that’s entirely on LLVM, not Zig.

I’m being asked to put something like this into the documentation of assert:

/// This is a function. This is not a preprocessor macro, since Zig doesn't
/// have macros nor a preprocessor. Expressions at the callsite used to call
/// this function are not evaluated lazily, since Zig doesn't have lazy
/// evaluation.

Why stop at assert? These comments apply to literally every function.

This is ridiculous. I’m not going to document the absence of language features just because people have PTSD from C.

It’s a missed optimization. You can file an LLVM bug report, you can wait until the Zig project starts to introduce optimization into the pipeline, you can add an explicit check for optimization mode, or you can simply do nothing, since it’s not a correctness issue, and the compiler will improve optimization over time.

You are implying that I am focused on “winning the ‘discussion’”, which is insulting. We cannot have a conversation in good faith under these conditions.


Zig will never guarantee that dead code will be eliminated, just like the Zig language specification does not guarantee any other specific optimization. That doesn’t mean you can’t rely on it – after all, we all rely on optimizations, even though most languages have zero guarantees about them!

Because dead code elimination works just like any other optimization, the way you should utilize it is no different than any other optimization. Broadly speaking, my advice is to assume that the optimizer will do the right thing. If you have a performance problem, and discover that the optimizer has failed to make optimal choices, then go ahead and add an explicit check (with a comment documenting the reason for its existence).


Makes sense. To be fair, Andrew mentioned that if an operation has side-effects, then it’s guaranteed to stay, but that doesn’t imply the opposite (that if it has none, it’s guaranteed to be removed).
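The guaranteed half of that statement can be sketched like this. recordAndCheck and run are hypothetical names; the point is only that an argument with a side effect is evaluated before assert is called, in every build mode.

```zig
const std = @import("std");

// Counts how many times the check has run.
var events: u32 = 0;

/// Hypothetical check with a side effect: it bumps a global counter.
fn recordAndCheck(value: u32) bool {
    events += 1;
    return value != 0;
}

fn run(value: u32) u32 {
    events = 0;
    // The argument is evaluated before assert is called, so the counter
    // update happens in Debug, ReleaseSafe, ReleaseFast and ReleaseSmall.
    std.debug.assert(recordAndCheck(value));
    return events;
}

pub fn main() void {
    std.debug.print("events = {d}\n", .{run(42)});
}
```

A check without such side effects may be removed in ReleaseFast, but as discussed above that is an optimization, not a guarantee.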


This discussion reminds me of the issues that newcomers have with “packed”, where people bring preconceptions about what it means based on their previous language experience, and run into troubles because “packed” in Zig means something quite different than it does in other languages (related concept but quite different meaning).

This comes up over and over. Zig packed != C/C++ packed. Zig assert != (other lang) assert. Zig for != (other language) for. The list goes on, whether that’s a language feature/keyword, or a std function.

Zig can’t always avoid this effect by choosing distinct names. In every case, there are high-level concepts that are in common. An assertion is a declaration of a constraint or assumption. Beyond that, the details will always be language specific.


Yeah, this thread has been clarifying: missed elision optimizations are a thing you must keep in mind anyway, side effects prevent elision from happening (the original question), and we have other tools at our disposal that we can combine with assert (std.debug.runtime_safety and friends).

Thanks everyone
