Well formulated thoughts.
A good read.
Generally my thoughts on assertions are: application requirements differ vastly, and userland assertions are just a tool you can opt in to or out of.
In high-availability apps (say, streaming), dropping a few packets is fine, but in a high-integrity environment (say, databases or trading), ignoring corrupted bits could mean broken logic, invalid state, or the worst things imaginable.
Some applications require best effort on all fronts and cannot compromise on any of the main qualities: it may be good to fail fast on assertions to guarantee integrity, while failing over to a backup to guarantee availability...
The need to have all the good things is even higher if a piece of software is deemed foundational, so its internals must be written by the finest of us and need to look and feel as flawless as possible. But it is also true that flawlessness is not a human trait, right? Kernels panic, systems get taken hostage, the power goes out...
All this is to say, users get to choose what they think is right.
I have one question for my particular use case, which is a database process (not a library) in which ReleaseSafe will always be used in production. ReleaseFast and ReleaseSmall will never be used.
For my use case I believe there is no significant difference in behavior between assert(condition) and if (!condition) @panic("assertion failed").
Is that correct? I'm asking just to confirm my understanding. It seems obvious now that I ask it, but I would still like to double check.
(There are a small number of places where I call @setRuntimeSafety(false) and use unreachable in those blocks where I'm extremely confident that they cannot be reached, but those can be disregarded for this question.)
I believe so. The panic might follow a slightly different code path when hit than the unreachable check, but for your use case I doubt it's significant.
It's basically correct. The only difference I can see is that there is one more indirection for the unreachable case: godbolt. Also note that there is std.debug.panic, which supports formatting.
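For anyone following along, here is a minimal sketch of the three forms being compared. The function names (halveAssert and friends) are made up for illustration; only std.debug.assert, @panic and std.debug.panic come from the posts above. In Debug and ReleaseSafe all three crash on a violated condition; the difference is that the first one goes through unreachable, so it is the only one that stops being a check in ReleaseFast/ReleaseSmall.

```zig
const std = @import("std");

// Form 1: std.debug.assert, which boils down to `if (!ok) unreachable`.
// With runtime safety on (Debug, ReleaseSafe), reaching `unreachable` panics.
fn halveAssert(n: u32) u32 {
    std.debug.assert(n % 2 == 0);
    return n / 2;
}

// Form 2: explicit check with @panic; this check stays in every build mode.
fn halvePanic(n: u32) u32 {
    if (n % 2 != 0) @panic("assertion failed");
    return n / 2;
}

// Form 3: std.debug.panic, same idea as form 2 but with a formatted message.
fn halvePanicFmt(n: u32) u32 {
    if (n % 2 != 0) std.debug.panic("expected an even number, got {d}", .{n});
    return n / 2;
}

pub fn main() void {
    std.debug.print("{d} {d} {d}\n", .{ halveAssert(8), halvePanic(8), halvePanicFmt(8) });
}
```

If ReleaseSafe really is the only production mode, picking between these is mostly a question of how much detail you want in the panic message.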
An interesting case that doesn't 100% agree is SQLite, which does disable asserts for a release build. (C, not Zig, but a very robust codebase.)
Although, their practice is subtly different. They use separate macros for asserts that they can't prove - ALWAYS and NEVER. These macros crash in debug builds, but act as a boolean passthrough in release, so there is always explicit code for correctly handling when assumptions are violated in production rather than just continuing and potentially causing issues.
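To make the passthrough idea concrete, here is a rough Zig analogue. This is not SQLite's code (theirs are C macros); always, never and firstByte are hypothetical names, and this is only a sketch of the pattern as described above: crash in Debug, hand the boolean back in every other mode so the caller's fallback branch is what runs in production.

```zig
const std = @import("std");
const builtin = @import("builtin");

// Hypothetical analogue of ALWAYS(): the condition is believed to always hold.
// Debug builds crash if it does not; other modes just return the value.
fn always(cond: bool) bool {
    if (builtin.mode == .Debug and !cond) @panic("always() condition was false");
    return cond;
}

// Hypothetical analogue of NEVER(): the condition is believed to never hold.
fn never(cond: bool) bool {
    if (builtin.mode == .Debug and cond) @panic("never() condition was true");
    return cond;
}

// The caller still writes real handling code for the "impossible" case,
// so release builds degrade gracefully instead of continuing blindly.
fn firstByte(bytes: []const u8) ?u8 {
    if (never(bytes.len == 0)) return null;
    return bytes[0];
}

pub fn main() void {
    std.debug.print("{d}\n", .{firstByte("abc").?});
}
```

The point of the design is that the fallback branch is ordinary code that has to compile and make sense, which a plain disabled assert never forces you to write.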
Also worth noting is that leaving the actual asserts on in SQLite makes it run 3x slower, which is part of why they disable them. But Loris is very critical of this:
If the cost of program misbehavior is so high that you don't want to risk it, then you should keep the asserts on, and if performance is so important that you're willing to risk misbehavior, then you're just leaving performance on the table, while thinking that you're safer than you really are.
I'm not sure what side of the fence I'm on with this.
Worth a read for any curious: The Use Of assert() In SQLite
Good read. I remember the posts between Kristoff and matklad on the topic, and it was the first time I heard about weird machines.
That conversation and this article keep reminding me of Ada, and how it has constraints that hold assertions true at compile time. I remember reading Ada code that avoided the overflow problem of a binary search over a large array, just because the overflow was caught at compile time, something which many implementations (and textbooks) only noticed as a problem many years after the fact.
By my own argument, if they feel confident enough to disable them in prod, then yes I think it should be fair game to turn those disabled asserts into optimize-able unreachables. I would definitely consider doing this in my own projects that use sqlite, as I trust the sqlite developers to be extremely thorough with their coding practices.
As matklad mentioned in another thread, the sqlite people have nuanced goals that might justify turning asserts off, but I still think it's a bad idea for "generic" software.
I'll paste here part of a message I wrote elsewhere on this same topic:
You add assertions for a variety of reasons but regardless they have an inherent axiomatic nature for the compiler (ie if they're not trivially redundant, they are facts that the compiler cannot derive on its own), and so that fact can be exploited in a variety of ways:
- better debuggability,
- extra checks at runtime to lower the chance of program misbehavior,
- extra facts that the compiler can use for optimization,
For the second case, Zig chooses crashing as the default mechanism, but if you really want you can fuck around with the panic handlers and implement something similar to a recoverable panic like in rust/go (a simpler version, and maybe that's for the better, the go one in particular has gnarly footguns), but Zig aims at "perfect software" so it's not what you get out of the box.
I have never in my life written an assertion with the explicit intent of obtaining a performance increase, and if I ever will, it will be a different process that involves measuring. My expectation is that, when building in ReleaseFast, the compiler will try best-effort to make use of the extra axioms I'm providing, as it already does with language-level asserts (oob, overflow, etc).
And the thing is that you can get all of these things out of asserts at the same time by switching build mode (and curating your asserts a little when necessary, like reserving expensive asserts for debug mode).
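One way "reserving expensive asserts for debug mode" could look, as a sketch: isSorted and contains are hypothetical names, and the split between cheap and expensive checks is just an example. The O(n) sortedness check only runs in Debug, while the cheap bounds-style invariant stays an ordinary assert that ReleaseSafe still checks.

```zig
const std = @import("std");
const builtin = @import("builtin");
const assert = std.debug.assert;

// O(n) check: too expensive to run on every lookup outside Debug.
fn isSorted(items: []const u32) bool {
    var i: usize = 1;
    while (i < items.len) : (i += 1) {
        if (items[i - 1] > items[i]) return false;
    }
    return true;
}

// Binary search over a sorted slice.
fn contains(sorted: []const u32, needle: u32) bool {
    // Expensive precondition, Debug builds only.
    if (builtin.mode == .Debug) assert(isSorted(sorted));

    var lo: usize = 0;
    var hi: usize = sorted.len;
    while (lo < hi) {
        // Cheap invariant, kept as a normal assert in every mode.
        assert(hi <= sorted.len);
        const mid = lo + (hi - lo) / 2; // avoids overflowing lo + hi
        if (sorted[mid] == needle) return true;
        if (sorted[mid] < needle) {
            lo = mid + 1;
        } else {
            hi = mid;
        }
    }
    return false;
}

pub fn main() void {
    const data = [_]u32{ 1, 3, 5, 7, 9 };
    std.debug.print("{} {}\n", .{ contains(&data, 7), contains(&data, 4) });
}
```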
Another variant of the second case is to have the assert print a log line instead of crashing the program, which makes sense when crashing is worse than continuing, panic recovery is complex and more prone to errors, and a human will have access to those logs and is eventually going to get a chance to fix falsifiable asserts. This is another thing that Zig doesn't give you out of the box, but this one is especially trivial to implement.
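A minimal sketch of such a logging assert, assuming you are fine passing @src() by hand. softAssert and soft_failures are hypothetical names, and the plain global counter is only there so failures can be observed; it is not thread-safe as written.

```zig
const std = @import("std");

// Hypothetical counter so that something (metrics, tests) can see failures.
// Not thread-safe; a real program would likely want an atomic here.
var soft_failures: usize = 0;

// Logs instead of crashing when the condition is false.
// The caller passes @src() so the log line points at the failing site.
fn softAssert(ok: bool, src: std.builtin.SourceLocation) void {
    if (ok) return;
    soft_failures += 1;
    std.log.warn("assertion failed at {s}:{d} in {s}", .{ src.file, src.line, src.fn_name });
}

pub fn main() void {
    softAssert(1 + 1 == 2, @src()); // holds, nothing is logged
    softAssert(1 + 1 == 3, @src()); // fails, logs a warning and keeps running
    std.debug.print("soft failures so far: {d}\n", .{soft_failures});
}
```

The sketch logs at warn level; whether a violated assumption deserves err or something louder is a policy decision for the application.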
Given all of this, turning off asserts in release builds to me seems the dumbest thing one could do, second only to deleting asserts from the codebase or straight up never writing them.
Turning asserts into log messages sounds a lot like disabling them, but it's not the same thing, since full on disabling means that nothing will ever be able to notify you that the program deviated from the spec.
In a similar fashion to SQLite, I think Knot Resolver[1] takes an interesting approach. They have their own kr_assert and kr_fails_assert functions, which upon a failed assertion fork the program into a second process, and immediately abort that new process, generating a core dump for inspection with a debugger. The original instance then keeps on running with some best-effort graceful failure behaviour (and logs a notification about the failure).
EDIT: It is actually configurable: the default behaviour is as described above, but there is also a mode where an assertion failure simply kills Knot Resolver. Additionally, there is also a "kr_require", which is for cases where graceful recovery is not possible/practical at all, which kills the whole daemon in all cases (and in all optimization modes; there is no way to disable this).
[1] A DNS resolver daemon. Used to be my day job, but the assert mechanism predates me working there.
Yes, this is how it works! We do the same at TigerBeetle.
Though, I expect that someday we'll get accused of misusing asserts, because we do rely in many places on invalid external input crashing the process with an assert, which is exactly the behavior we want for our system, but not how one typically uses asserts.
On the other hand that's a nice lifehack for dissuading people from building ReleaseFast TigerBeetle, if you don't want them to (on top of the build.zig logic).
I'm not familiar with TB internals, but for some distributed systems it's a problem if an attacker can take down specific nodes at specific times to make consensus gaming more practical (such as partitioning, censoring, etc.). Is this something your simulation testing covers, for example?
We also @compileError on unsafe @import("builtin").mode in the code itself, so you'll have to patch source code as well.
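For readers who have not seen this trick, a sketch of what a build-mode guard of that kind can look like. This is not TigerBeetle's actual code, just the general shape: a top-level comptime block that refuses to compile in the modes where safety checks are off. The error message text is made up.

```zig
const std = @import("std");
const builtin = @import("builtin");

// Evaluated at compile time: building with -O ReleaseFast or -O ReleaseSmall
// fails with the message below instead of producing a binary.
comptime {
    switch (builtin.mode) {
        .Debug, .ReleaseSafe => {},
        .ReleaseFast, .ReleaseSmall => @compileError(
            "this program relies on safety checks; build with Debug or ReleaseSafe",
        ),
    }
}

pub fn main() void {
    std.debug.print("built in {s} mode\n", .{@tagName(builtin.mode)});
}
```

Because the switch is exhaustive over the optimize modes, adding a new mode to the language would also force this guard to be revisited.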
Our fault model is that communication is trusted: if you talk to replica 3, that really is replica 3 (currently enforced operationally, outside of TigerBeetle proper).
Mandatory list of side effects in Zig: Key semantics of std.debug.assert - #9 by andrewrk
I noticed during my experiments (don't ask me for details) that:
- do_something() catch unreachable can produce slower code than do_something() catch @panic(...)
- there is no real direct way to predict if an assert can optimize our code or we should protect it in release mode by
if (is_releasefast) assert(...)
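For reference, the two spellings from the first bullet side by side, in a form small enough to paste into godbolt and compare per build mode. parseDigit and the wrapper names are hypothetical; this sketch makes no claim about which one is faster, that has to be measured for your code and compiler version.

```zig
const std = @import("std");

// Hypothetical fallible function standing in for do_something().
fn parseDigit(c: u8) error{NotADigit}!u8 {
    if (c < '0' or c > '9') return error.NotADigit;
    return c - '0';
}

// Variant A: the error is declared impossible. Checked in Debug/ReleaseSafe,
// an optimizer assumption in ReleaseFast/ReleaseSmall.
fn digitUnreachable(c: u8) u8 {
    return parseDigit(c) catch unreachable;
}

// Variant B: the error always leads to a panic, in every build mode.
fn digitPanic(c: u8) u8 {
    return parseDigit(c) catch @panic("not a digit");
}

pub fn main() void {
    std.debug.print("{d} {d}\n", .{ digitUnreachable('7'), digitPanic('7') });
}
```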
My code is currently still a wild mix of using assert, catching errors with some other mechanism, just crashing, or just accepting UB.
That link was extremely helpful. Thank you
I'm not quite sure what "disabling assertions in a production environment" means. I speculate that their alternative approach is: rewriting all the places that would originally trigger assertions to report an error, which must be explicitly caught by the upper layer, and the upper layer decides whether to adopt a crash approach equivalent to the original assertion or to use another error handling method.
"subonciously" - that's a really cool word. It should be real. Shakespeare?