Do we need an @assert?

I misread it, or was confused by others.

Reading it again, std.debug.assert should do everything you want it to. Or rather, if (!condition) unreachable does what you want, which is exactly what assert does.
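As a minimal sketch of the equivalence described above: `myAssert` is a hypothetical name used only for illustration, written in the same shape as `std.debug.assert` (a plain function around `unreachable`, with the condition negated).

```zig
const std = @import("std");

// Hypothetical re-implementation for illustration; same shape as std.debug.assert.
fn myAssert(ok: bool) void {
    // Note the negation: `unreachable` is hit when the condition is FALSE.
    if (!ok) unreachable;
}

pub fn main() void {
    const many: usize = 3;
    std.debug.assert(many >= 2); // the standard library version
    myAssert(many >= 2); // the hand-written equivalent
}
```

Because it is an ordinary function, its argument is evaluated like any other argument; nothing is elided by the language itself.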

That is not what ‘side effects’ are in compiler/programming-language terms. Side effects are effects that can be seen beyond the function call, such as writing to a pointer passed to the function.

Optimisations are not side effects.


Hence why I’m suggesting we settle on what the behavior of a hypothetical @assert would be, and illustrate it with short examples.

Edit: I didn’t know people meant “side effects” as affecting the state of the program. My understanding was that it means things the compiler can’t see beyond: syscalls, accessing volatile pointers and stuff.

I think of a side-effect as ANY change in program state; the term seems a bit of a misnomer, writ large - maybe “anyeffect” would be a better term. With respect to the notion of assert(), it distinguishes between a “mere assert” (ALL it’s doing is checking the truth value) and an assert that contains code that happens to alter program state in any way. The latter seems worth avoiding like the plague, imo, especially IF the proposal is for target-based magic (e.g., “assert only executes in debug” motif).

Amen and amen.


EDIT: sorry for my shortsightedness here; I see that there’s a well-defined exhaustive list of all actual “side effects” pertinent here. “debug code” that, e.g., increments an analytics counter is surely reasonable, and I suppose I’d normally encapsulate expensive activity within if (constants.verify) to protect a hotpath. I don’t ultimately see a need for a macro-ish assert(), but I could be missing some points. I know I need to go wrap myself more completely around unreachable, as others have done, before I can add value. But, at my level of ignorance, my “lean” is very much with the zig status-quo of assert() only ever being a plain function, regardless of the heritage that the term ‘assert’ has in other languages. I also agree, therefore, that it seems assert() could live in std rather than std.debug, at least to help avoid the tendency newcomers will have, to think of it in terms of that heritage.


Problem here is the definition of side effects. It could be as innocent as increasing an analytics counter. And suddenly all measured apps have side effects.

But there is no way around it without hidden control flows or the ability to mark an actual side effect as purely analytics measurements or inconsequential.

Side effects are well defined in Zig: Key semantics of std.debug.assert - #9 by andrewrk


Thanks, I’m not crazy then.

I suggest we use another term, like ā€œmutating behaviorā€ or something. Because I think side effects are actually quite relevant to the topic as well.

I wrote a guard routine for Debug and ReleaseSafe modes.
Say goodbye to (@)assert optimizations :frowning:

        if (comptime lib.is_guarded) {
            lib.guard(input_depth >= 0, "search inputdepth < 0", .{});
            lib.guard(input_beta > input_alpha, "search wrong alpha beta", .{});
            lib.guard(is_root or pos.ply > 0, "search wrong is_root vs ply", .{});
            lib.guard(!is_root or input_depth > 0, "search wrong is_root vs input_depth", .{});
            lib.guard(!is_root or is_root == is_pvs, "search wrong is_root vs is_pvs", .{});
            lib.guard(is_pvs or input_beta - input_alpha == 1, "search wrong is_pvs vs alphabeta", .{});
        }
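The post does not show `lib` itself. The following is only a guess at what such helpers might look like: `is_guarded` and `guard` here are hypothetical stand-ins, assuming the guard panics with a formatted message rather than reaching `unreachable`.

```zig
const std = @import("std");
const builtin = @import("builtin");

// Hypothetical stand-in for `lib.is_guarded`: true only in the safety-checked modes.
pub const is_guarded = builtin.mode == .Debug or builtin.mode == .ReleaseSafe;

// Hypothetical stand-in for `lib.guard`: panics with a formatted message.
// It never reaches `unreachable`, so it gives the optimizer no assumption to use.
pub fn guard(ok: bool, comptime fmt: []const u8, args: anytype) void {
    if (!ok) std.debug.panic(fmt, args);
}

pub fn main() void {
    const input_alpha: i32 = -50;
    const input_beta: i32 = 50;
    // The whole block is compiled out when is_guarded is false.
    if (comptime is_guarded) {
        guard(input_beta > input_alpha, "search wrong alpha beta", .{});
    }
}
```

The trade-off hinted at in the post: a panic-based guard gives a readable message, but unlike `assert` it cannot be turned into an optimization hint in ReleaseFast.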

I’ve enjoyed what I’ve gleaned from this thread and this thread and this article and this article, all referenced above (or hereabouts). One question that emerges is: would (simply) this be advisable (in our lib/application code, that is):

if (many < 2) unreachable;

? That is… is there anything wrong with promoting the use of unreachable directly, in the wild, potentially in place of calls to assert() (and the baggage of the term ‘assert’) altogether? Coupled with a true understanding of what unreachable does (and doesn’t do), one could perhaps embrace it for its own merits, and be … perhaps even “more” aware of side-effect and no-side-effect scenarios… and even cleaner coding and commenting around them? I know that I’m one who developed an allergy for putting anything side-effect-ish “inside” of macro asserts (C, e.g.); with direct unreachable grace, I might write code that is particularly obvious to readers of all sorts, even those with preconceived notions of assert (in other languages) - they won’t jump to the conclusion that some bit of code will be elided, and, if they bother to learn about unreachable, they might realize that some bits of code might be optimized when appropriate for the compiler. More advanced users can be pretty certain that bits of code will never be optimized, as long as the code contains side-effect code such as that which stores or loads through a volatile pointer, or which intentionally contains @breakpoint, e.g.


It’s funny people didn’t seem to focus on this suggestion as, in my opinion, it covers most of the issue.

The issue to me isn’t assert(), which is just syntactic sugar for if (!booltest) unreachable. That’s just fine. I like the fact that unreachable can be used to hoist information into the compiler, and a disappearing assert() would undermine that.

The issue for me is that I have no obvious way to flag/test if a condition does or does not have side effects. If I were asserting on hashes, someone could change a hash function to cache values instead of recompute and suddenly my nice side-effect free assert() now doesn’t get optimized away. Being able to assert(@comptimeErrorIfSideEffectsElseValue(booltest)) is something useful. The compiler has to do that check anyway to avoid incorrectly eliding the assert()–this would just expose that logic as a builtin.
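A small sketch of the hashing scenario, with everything in it hypothetical (`hashPure`, `hashCached`, and the one-entry cache are invented for illustration). Both functions return the same value, but the cached one writes to global state, so the call site of the assert looks identical while the work behind it is no longer something the optimizer can trivially drop.

```zig
const std = @import("std");

// Hypothetical pure hash: the result depends only on the argument.
fn hashPure(x: u32) u32 {
    return x *% 2654435761; // wrapping multiply, no state touched
}

// Hypothetical one-entry cache held in global (mutable) state.
var cache_key: u32 = 0;
var cache_val: u32 = 0;
var cache_valid = false;

// Same result as hashPure, but it now stores to globals as a by-product.
fn hashCached(x: u32) u32 {
    if (cache_valid and cache_key == x) return cache_val;
    cache_key = x;
    cache_val = hashPure(x);
    cache_valid = true;
    return cache_val;
}

pub fn main() void {
    const expected = hashPure(7);
    std.debug.assert(hashPure(7) == expected); // no observable effects
    std.debug.assert(hashCached(7) == expected); // same check, but mutates the cache
}
```

Nothing at either assert line reveals the difference, which is the non-local change the post is worried about.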

Musing: I wonder how many specific compiler optimizations could be removed in deference to more general ones if instead we relied on code using “unreachable” a lot more. :thinking:


Yes, but isn’t that check done at the backend/LLVM layer?

Using unreachable directly is something I advocate for.

But the argument against that is the same: assert baggage from other languages. People are used to assert, they know what it means, they are less/not familiar with unreachable.

You already said the other side of that argument, adding to it: they are already learning zig, unreachable is a small and rather simple part of it. It is also easier to learn a new concept than it is to modify an existing one.

There are two orthogonal issues:

  1. The unimportant one is that assert() semantics differ from other languages. My take … tough noogies. This is Zig–deal with it or go use a different language.

  2. The important problem is that a test which calls unreachable can silently and invisibly switch to having side effects due to very minor source code changes a long way away from the unreachable test. This probably won’t cause “correctness” bugs, but your performance may suddenly go into the dumpster with no local indication of why. That appears, to me at least, to strongly violate the “Zen of Zig”.

Normally, I would just chalk this up as something rare, but this already exists in std.Io, in putOne() and getOne() (Edit: finally got the Codeberg link to work: Codeberg link to getOne), and it’s the man himself who wrote that code. I strongly suspect that I have more than enough levers to make that assert swing from super fast to super slow by jiggling code a long way away. Or maybe Andrew doesn’t expect that statement to be side-effect free, but I can’t obviously tell that from the code.

Creating the moral equivalent of @hasNoSideEffects() (or @hasSideEffects() I don’t really care what the name is) allows you to guard against that inside code that you own and get a local message when something far away hoses you. It also allows the person who wrote a library assert or unreachable to tell you what they expect.


This doesn’t seem to be a response to what I said.

But I am all for some way to programmatically detect side effects.
The issue (if you care about it) is that it doesn’t change assert: assert is a function that just takes a bool argument, so it already has the result and will never be aware of side effects.

Though I am in favour of that, as it prevents hidden control flow.

This problem will never go away, unless you introduce function colouring for side effects.

But checking for side effects programmatically would at least allow those who care to assert on it or react to it.

My mental model is this:

  • C asserts are a request for something to be checked
    • “Please assert that this expression is true.”
  • Zig asserts are a promise
    • “I, the programmer, assert that this expression is true.”
    • “…and feel free to call me out for lying.”

This makes Zig asserts informational to the compiler. It can assume something is true because the programmer told it so. If you litter your Zig programs with assert, you give the compiler as much information as you can to optimise with. Get out of the mind-set that assert is for catching bugs. It may do that too if you’re in the right build mode, but it’s not the primary purpose.
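To make the "promise" reading concrete, here is a minimal sketch (the function `average` is hypothetical). The assert states a fact the caller guarantees; in safety-checked builds a violation panics, and in ReleaseFast the compiler is allowed to assume it holds, for example when compiling the division.

```zig
const std = @import("std");
const assert = std.debug.assert;

// Hypothetical helper for illustration.
fn average(values: []const u32) u32 {
    // Promise to the compiler (and the reader): the slice is never empty.
    assert(values.len > 0);

    var sum: usize = 0;
    for (values) |v| sum += v;

    // With the promise above, the divisor is known to be non-zero.
    return @intCast(sum / values.len);
}

pub fn main() void {
    const xs = [_]u32{ 1, 2, 3 };
    std.debug.print("{d}\n", .{average(&xs)});
}
```

Whether a given compiler version actually exploits the assumption here is not guaranteed; the point is only that the information is available to it.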


Here’s my take:

In Zig, asserts are a way of telling the compiler: ‘if this value is false, my program is in an invalid state’.

The compiler uses this information to produce a more optimal program. But what is it optimizing?

Safety? → spend as little time as possible in invalid program state → compute the value and panic if it’s false.

Speed? → be as fast as possible while the program is in a valid state → use the logical content of the assertion to do less work, on the assumption that it’s true.

It’s always about optimizing something, it’s just a question of what.
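The safety/speed split can be written down as a small lookup over the build mode. This is only a descriptive sketch: `failedAssertBehaviour` is a hypothetical helper, and the strings summarize the behaviour of a false assert (that is, of reaching `unreachable`) in each mode.

```zig
const std = @import("std");
const builtin = @import("builtin");

// Hypothetical helper: describes what a FALSE assert means in a given build mode.
fn failedAssertBehaviour(mode: std.builtin.OptimizeMode) []const u8 {
    return switch (mode) {
        // Optimizing for safety: the condition is computed and a failure panics.
        .Debug, .ReleaseSafe => "panic: reached unreachable code",
        // Optimizing for speed or size: the condition is assumed true.
        .ReleaseFast, .ReleaseSmall => "undefined behaviour: condition assumed true",
    };
}

pub fn main() void {
    std.debug.print("{s}: {s}\n", .{
        @tagName(builtin.mode),
        failedAssertBehaviour(builtin.mode),
    });
}
```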


I’m completely on board, and think this deserves attention. Indeed, if @hasNoSideEffects() can be reliably determined at compile time, that seems great. I also struggle, though, to imagine frequently finding myself in this situation. That may be a problem with my imagination, indeed (as you cite an example in std!), but it seems a tendency to defensive coding will go a little ways toward minimizing the chance. Honestly, I’ve given my imagination more to the task of if statements and assert-like uses; I haven’t fully considered how likely a surprise will lurk with an unreachable as the else=> in a switch (or switch-loop!?) or other more clever (and probably excellent) uses. But it seems that if I’m rarely making a call that might go stack-deep IN the decision that locally branches toward unreachable (or not), then I’m rarely going to have trouble. Right? Most people’s min < max types of checks are quite safe from down-the-road surprises, aren’t they?

Let’s say there WAS a @hasNoSideEffects that worked beautifully, when would I use it? Perhaps never in the case where I don’t have function calls in the decision locus. But, when I do, perhaps I’d always guard with this @? Just to be on the safe side?


I agree with you, and the reason why I cited the stuff in std is specifically because I’m not particularly fond of that style, but it looks like the Zig developers are.

Having an assert() that has side effects actually bothers me a lot (and has resulted in hard to track down bugs in other languages), and I would almost always shield them with @hasNoSideEffects() if I had it. I also find that something inside an assert is almost always something I’m eventually going to have to debug, so I don’t want multi-stage code inside an assert anyway (I have similar complaints about “chaining” styles of connected object function calls in other languages as they are hard to break apart to debug). Of course, like you said, it may simply be a failure of my imagination.

However, if you look at the ghostty codebase you have the counterexample to the complex assert style–it only (as far as I can tell) uses assert() to enforce side-effect free invariants.


Indeed, and, at the risk of beating a dead horse, can anyone help me make sure I’ve even got the scenarios covered?

There’s the obvious one: IF there are side-effects identified, then the code will NOT be optimized away. But the converse is not guaranteed, right? Code withOUT side effects might still NOT be optimized away, true? So there would be (no-side-effect) cases in which code gets elided, and others in which it doesn’t. Perhaps, normally, we don’t care, because perhaps the compiler knows best how to do its job. But it’s possible, then, right, to have expensive code that you’re pretty sure should be optimized in ReleaseFast, but…?

Criticize this snippet, please. I don’t think any of the documented side-effects are here:

pub fn add1(x: *u8) *u8 {
   x.* += 1;
   return x;
}
pub fn unreach() !void {
   var a: u8 = 1;
   const z: u8 = 2;
   if (add1(&a).* > z) unreachable;
   if (add1(&a).* > z) unreachable; // will fail if previous add1() call was not elided
}

This fails not only in debug modes, where you’d expect, but in ReleaseFast, as well, indicating that there was no need, perhaps, to optimize add1() away. But when I create an addlots() function that also merely increments x.*, but does so in a million-iteration loop, making the return value giant, and then try again, the same thing happens: neither of the unreachables results in removal of that technically side-effect-less code. I’m using unreachable, here, but especially if one had an uninformed idea of what assert() does, they might expect it to not run all that code within, when built ReleaseFast. If I haven’t missed something here, @hasNoSideEffects() might not always be helpful. It’s worth re-endorsing if (constants.verify)-guarding expensive blocks, and, sure, one could put an if-check (e.g., for a debug mode) IN an assert function (and perhaps some would expect that to be in the std.debug.assert(), since it’s in std.debug), but it might simply be bad form to expect anything specific to happen around unreachable itself, except to expect that side-effect code will NOT be optimized away?
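A minimal sketch of the `if (constants.verify)` guarding mentioned above, assuming a comptime-known flag. The names `verify`, `isSorted`, and `smallest` are hypothetical; the idea is that the expensive check sits behind a constant, so whether it runs no longer depends on what the optimizer decides about `unreachable`.

```zig
const std = @import("std");
const builtin = @import("builtin");

// Hypothetical `constants.verify`-style flag, comptime-known.
const verify = builtin.mode == .Debug;

// Hypothetical expensive check: O(n) scan.
fn isSorted(items: []const u32) bool {
    var i: usize = 1;
    while (i < items.len) : (i += 1) {
        if (items[i - 1] > items[i]) return false;
    }
    return true;
}

// Returns the smallest element of a slice the caller promises is sorted.
fn smallest(items: []const u32) u32 {
    std.debug.assert(items.len > 0); // cheap, always stated
    // Expensive: the branch is dropped at compile time when `verify` is false.
    if (verify) std.debug.assert(isSorted(items));
    return items[0];
}

pub fn main() void {
    const xs = [_]u32{ 1, 2, 3 };
    std.debug.print("{d}\n", .{smallest(&xs)});
}
```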

Thanks for helping me button up thoroughly.

First of all, I don’t think this feature is worth a new builtin function. We can easily implement it in user space.

There is also a PR to the standard library that implements this function, although it was rejected. However, similar code can easily be added to one’s own module.

The problem with adding this function to the standard library is that it actually detects the compilation mode of the standard library rather than that of the current module. But if you copy this function to your own module for use, there won’t be this problem.
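The post does not show the function from the rejected PR, so the following is only a guess at the general shape of a user-space version. `debugAssert` is a hypothetical name; the sketch assumes the goal is a check that is active only when the module it lives in is built in Debug mode.

```zig
const std = @import("std");
const builtin = @import("builtin");

// Hypothetical user-space helper (not the code from the PR mentioned above).
// Placed in your own module, it reads the optimize mode through that module's
// `builtin` import rather than from inside std.
pub fn debugAssert(ok: bool) void {
    if (builtin.mode == .Debug) {
        if (!ok) @panic("debug assertion failed");
    }
}

pub fn main() void {
    const many: usize = 3;
    debugAssert(many >= 2);
}
```

Note that the argument expression is still evaluated at the call site in every mode; only the check itself disappears outside Debug, and whether the now-unused computation is removed is up to the optimizer.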
