Panic: reached unreachable code with zig build --watch

I know this is new, but I’m on WSL2, and after using zigup to install the latest version of the compiler to test --watch, I’ve discovered that I cannot use it. Is this because I’m using WSL2?

❯ zig build --watch
thread 4688 panic: reached unreachable code
/home/pollivie/zig/0.14.0-dev.296+bd7b2cc4b/files/lib/std/posix.zig:4508:19: 0x10eadf7 in fanotify_init (build)
        .INVAL => unreachable,
                  ^
/home/pollivie/zig/0.14.0-dev.296+bd7b2cc4b/files/lib/std/Build/Watch.zig:246:55: 0x10eab82 in init (build)
            const fan_fd = try std.posix.fanotify_init(.{
                                                      ^
/home/pollivie/zig/0.14.0-dev.296+bd7b2cc4b/files/lib/compiler/build_runner.zig:375:38: 0x11030f3 in main (build)
    var w = if (watch) try Watch.init() else undefined;
                                     ^
/home/pollivie/zig/0.14.0-dev.296+bd7b2cc4b/files/lib/std/start.zig:515:37: 0x10de225 in posixCallMainAndExit (build)
            const result = root.main() catch |err| {
                                    ^
/home/pollivie/zig/0.14.0-dev.296+bd7b2cc4b/files/lib/std/start.zig:258:5: 0x10ddd41 in _start (build)
    asm volatile (switch (native_arch) {
    ^
???:?:?: 0x9 in ??? (???)
Unwind information for `???:0x9` was not available, trace may be incomplete

error: the following build command crashed:
/home/pollivie/workspace/zig/zlib/.zig-cache/o/3834c987d66e3c28699a494ab242511c/build /home/pollivie/zig/0.14.0-dev.296+bd7b2cc4b/files/zig /home/pollivie/zig/0.14.0-dev.296+bd7b2cc4b/files/lib /home/pollivie/workspace/zig/zlib /home/pollivie/workspace/zig/zlib/.zig-cache /home/pollivie/.cache/zig --seed 0x46ccd56b -Z3af7a1a268123ca5 --watch

What kernel version does WSL2 provide?

I use the latest Ubuntu LTS distro, and uname -r displays: 5.15.146.1-microsoft-standard-WSL2

After looking around online, it seems this may come from an incomplete implementation of fanotify on the Windows side of things: WSL2 - fanotify related issues from 2021

My suggestion would be to wait for the Windows implementation of --watch and then run your build in Windows natively instead of WSL2.

IMO, it’s better to use Real Windows, or Real Linux, rather than Cheap Knockoff Linux inside Real Windows.

Yes, even after updating to 5.15.153.1-microsoft-standard-WSL2 it still doesn’t work, but it works just fine on my second partition with NixOS. By the way, thanks, it’s a very practical addition.

I’m getting the same error on native Linux Mint. My kernel version is 5.15.0-102.

It looks like Zig depends on a couple of flags, such as FAN_REPORT_TARGET_FID, which are not available in that kernel version. It would be nice to know the minimum kernel version needed for --watch.
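One way to check whether a given kernel accepts these flags is a small C probe that calls fanotify_init directly and reports the errno. This is only a sketch, not the code Zig runs: probe_fanotify is a hypothetical helper, and the flag set in WATCH_FLAGS_GUESS is an assumption built from the flag named above plus the related FAN_REPORT_* flags. The fanotify_init(2) man page lists the kernel version that introduced each flag.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/fanotify.h>
#include <unistd.h>

/* Older libc headers may not define the newer flags.
 * Fallback values are taken from the kernel's linux/fanotify.h. */
#ifndef FAN_REPORT_FID
#define FAN_REPORT_FID 0x00000200
#endif
#ifndef FAN_REPORT_DIR_FID
#define FAN_REPORT_DIR_FID 0x00000400
#endif
#ifndef FAN_REPORT_NAME
#define FAN_REPORT_NAME 0x00000800
#endif
#ifndef FAN_REPORT_TARGET_FID
#define FAN_REPORT_TARGET_FID 0x00001000
#endif

/* Assumed flag set, for illustration only; not confirmed against Watch.zig. */
#define WATCH_FLAGS_GUESS (FAN_CLASS_NOTIF | FAN_CLOEXEC | FAN_NONBLOCK | \
                           FAN_REPORT_FID | FAN_REPORT_DIR_FID |          \
                           FAN_REPORT_NAME | FAN_REPORT_TARGET_FID)

/* Hypothetical helper: returns 0 if the kernel accepts `flags`, otherwise
 * the errno reported by fanotify_init (for example EINVAL for flags the
 * kernel does not know, or EPERM if the caller lacks privileges). */
int probe_fanotify(unsigned int flags)
{
    int fd = fanotify_init(flags, O_RDONLY);
    if (fd < 0)
        return errno;
    close(fd);
    return 0;
}
```

Calling probe_fanotify(WATCH_FLAGS_GUESS) from a small main and printing strerror of the result shows whether the running kernel rejects the combination. An EINVAL result would correspond to the branch that panics in the stack trace above.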

I’m going to keep banging on a certain drum here: std.posix needs to stop pretending that certain errno values are unreachable. An errno comes from the system/kernel, not from the program or the standard library, so by construction it is not unreachable.

Look at the stack trace! It was reached!

I maintain that the best policy is to return whatever errno enum is seen, and let the program decide what is and isn’t unreachable or crash-worthy. But if std.posix needs to crash on certain errors for some reason (dubious!), it should @panic. That unreachable in the stack trace is undefined behavior in ReleaseFast and ReleaseSmall modes; only Debug and ReleaseSafe turn it into a panic. No bueno.
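The policy argued for here can be sketched in C, where GCC and Clang's __builtin_unreachable() plays a role similar to Zig's unreachable in unchecked builds. This is a minimal illustration, not std.posix code: the init_result enum and classify_init_errno are hypothetical names, and the set of errno values handled is an arbitrary selection.

```c
#include <errno.h>

/* Hypothetical result type: every errno maps to a value the caller can see. */
enum init_result {
    INIT_OK,
    INIT_INVALID_ARGUMENT,
    INIT_TOO_MANY_FDS,
    INIT_OUT_OF_MEMORY,
    INIT_PERMISSION_DENIED,
    INIT_UNSUPPORTED,
    INIT_UNEXPECTED
};

/* Maps an errno from a syscall wrapper to a result instead of assuming
 * that some values can never occur. */
enum init_result classify_init_errno(int err)
{
    switch (err) {
    case 0:
        return INIT_OK;
    case EINVAL:
        /* Not __builtin_unreachable(): the kernel, not the caller,
         * decides whether the flags are valid on this system. */
        return INIT_INVALID_ARGUMENT;
    case EMFILE:
    case ENFILE:
        return INIT_TOO_MANY_FDS;
    case ENOMEM:
        return INIT_OUT_OF_MEMORY;
    case EPERM:
        return INIT_PERMISSION_DENIED;
    case ENOSYS:
        return INIT_UNSUPPORTED;
    default:
        /* Unknown values are reported, so the program can choose to abort. */
        return INIT_UNEXPECTED;
    }
}
```

With this shape, the decision to crash moves to the caller: a build tool can print a message about an unsupported kernel on INIT_INVALID_ARGUMENT, while a program that truly cannot continue can still call abort().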

Open a bug.

There are already many bugs on the issue tracker.

This is the main one, which I have commented upon already.

This behavior isn’t really a bug: std.posix didn’t end up this way by accident. It’s more of a design flaw. This is as good a place as any to point out that changing the status quo behavior is worth carefully considering.

I think the problem here is the dichotomy between the standard library as support code for the toolchain, and the standard library as a general purpose library.
When writing your own program, you want to encode assumptions in it, so the compiler can generate better code. When writing a library for other people to use, there are fewer assumptions that you can make.
Since the primary purpose of the standard library is to support the toolchain, they encoded the assumption that it would be used in a certain way.
In this issue that you linked, it is argued that:

The Linux API states that calling kill with a negative pid is valid.

If pid is less than -1, then sig is sent to every process in the process group whose ID is -pid.

However, if the Zig toolchain does not intend on using it in this way, then it makes sense to encode this assumption in the code and get some extra performance.
If you want to make a good case for this, what you can do is find one instance in the Zig source code where the toolchain uses this function, and craft a payload that triggers the undefined behavior.
For instance, in the issue linked, the author argued the following:

It’s possible for that misbehaving child process to signal its parent to reap the misbehaving children (grandchildren to the correct program). When the correctly behaving program attempts to clean up its grandchildren, if the poorly behaving program has done so already, or something else happened in between, the process group may no longer exist, and Linux will respond with:

ESRCH The target process or process group does not exist. Note that an existing process might be a zombie, a process that has terminated execution, but has not yet been waited for.

This is an unexpected state for the parent process, but it’s not programmer error for the application that’s going to hit UB, or panic.
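The ESRCH case quoted above can be reproduced with plain POSIX calls. The sketch below uses a hypothetical helper, signal_vanished_group, that puts a child in its own process group, lets it exit, reaps it, and then signals the now-empty group. It uses signal 0 rather than a real signal as a safety assumption, since signal 0 only performs the existence and permission checks.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns the errno from kill() on a process group
 * that no longer exists, 0 if kill() unexpectedly succeeded, or -1 if the
 * setup (fork or waitpid) failed. */
int signal_vanished_group(void)
{
    pid_t child = fork();
    if (child < 0)
        return -1;
    if (child == 0) {
        setpgid(0, 0); /* become leader of a new process group */
        _exit(0);      /* exit at once, like a child that died early */
    }

    /* Reap the child; after this its process group has no members. */
    int status;
    if (waitpid(child, &status, 0) < 0)
        return -1;

    /* A negative pid addresses the whole group whose ID is `child`.
     * Signal 0 delivers nothing, so no process is harmed if the ID
     * has been reused in the meantime. */
    if (kill(-child, 0) == 0)
        return 0;
    return errno;
}
```

On Linux this is expected to return ESRCH. A wrapper that treated ESRCH as unreachable would reach that branch here without any programmer error in the calling application.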

Let’s suppose that some part of the Zig build system spawns a process and calls kill on it if it misbehaves. You can craft a misbehaving program and make the build system call it. If the argument posed by the issue author holds, then this will trigger the UB in the build code. Therefore, you’ll be able to argue that the assumption they encoded is causing a bug in the Zig toolchain itself, not in other people’s code. This will be a very compelling argument that Zig should fix it. For this test, you probably want to use a debug build of the compiler, to properly trigger a panic.
