Tracking down a segfault

My program is segfaulting, and so far I have failed to understand the cause. I’m sure it is something silly and I’ll feel stupid as soon as I find the answer, but here I am - that’s what you get for using zig as a gateway drug to low-level programming! :)

My question is not specific to my problem: how do you folks track down a segfault in a zig program? What tools do you usually use? Is it different when your code calls C functions? Do you have any advice for newcomers who have rarely had to battle with this?


Just for the record, my segfault is here - calling jack_client_open produces this rather cryptic output:

Segmentation fault at address 0x0
???:?:?: 0x0 in ??? (???)

Haven’t done it myself, but I recall others on the forum mentioning that you can use GDB with the debug binary, just like with any C debug build.

Address 0x0 is special: it means that the pointer was NULL.
Since zig’s ordinary pointers cannot be null and optional pointers have to be unwrapped before use, the value most likely came from C or was not initialized.


If you have a core file you can use a debugger (gdb or lldb) to view the contents of the stack and the variables at the time of the crash.

I don’t have a core file - do I have to instruct my program to generate a core dump in case of failure?

Julia Evans’s article on how to get a core dump for a segfault on Linux is excellent; she recommends using valgrind.

Edit: It is not always necessary to get a core dump. If you can reproduce the bug, you can just start the program in the debugger. Beej’s Quick Guide to GDB is a nice tutorial for gdb.
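To make both routes concrete, here is a rough sketch for a Linux shell. It assumes gdb is installed, and BIN is a hypothetical path (zig-out/bin is where zig build installs by default; noize is just the program name from this thread). Where the core file actually ends up depends on your system's core_pattern, so treat this as a starting point rather than a recipe.

```shell
# BIN is a hypothetical path; point it at your own debug build.
BIN="${BIN:-./zig-out/bin/noize}"

# Raise the core-size soft limit for this shell session (it is often 0 by default).
ulimit -c unlimited 2>/dev/null || echo "note: could not raise the core file limit"

# Show where the kernel sends core files (a file name pattern, or a pipe to a handler).
[ -r /proc/sys/kernel/core_pattern ] && cat /proc/sys/kernel/core_pattern

# No core file needed: run the program under gdb and print a backtrace when it stops.
if [ -x "$BIN" ] && command -v gdb >/dev/null; then
    gdb -batch -ex run -ex bt "$BIN"
fi
```

If you do end up with a core file, passing it to gdb as a second argument after the binary loads the state at the time of the crash, and bt then shows the stack.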

Using a debugger is the way to go, and it is sadly often underappreciated. It not only helps you find, understand, and fix these kinds of problems; it is also great to just step through working code every once in a while, to get a more detailed picture of what is happening in your program and to confirm that it really does what you think it does.

I have mostly used lldb, both its CLI and the version available via the CodeLLDB extension. It is not perfect, but it works well enough until there is something more zig-focused.
One tip, for example: set a watch on a variable and put ,b after its name to have the value formatted as a bit string, which is useful when working with bitsets. There are other format options too, but you have to look those up.

I think CodeLLDB can even be scripted in Python, so it may make sense to create some community scripts that make some standard data structures etc. easier to inspect in the debugger, if anyone has the time / interest to dig into the details. So far I have only needed the basics, so I haven’t spent time on that.

The advice others have given about using a debugger is good, and will help in most cases. In this case, however, the backtrace is not very helpful:

* thread #1, name = 'noize', stop reason = signal SIGSEGV: address not mapped to object (fault address: 0x0)
    frame #0: 0x0000000000000000
error: memory read failed for 0x0

The cause of this obscure error is that you’re not linking libc; when I add exe.linkLibC(); to your build.zig, the resulting program works fine for me.

Related issue: "zig build-exe not warning for missing -lc parameter" (ziglang/zig issue #10410 on GitHub)
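For anyone building straight from the command line instead of through build.zig, the rough equivalent of the fix is passing -lc yourself. This is only a sketch: build_with_libc is a hypothetical helper name, the source path is whatever your project uses, and -ljack is assumed here because jack_client_open comes from the JACK client library.

```shell
# Hypothetical helper: build one zig source file, linking libc and JACK.
# Usage: build_with_libc src/main.zig
build_with_libc() {
    # -lc links libc; -ljack links the JACK client library (assumed for this program).
    zig build-exe "$1" -lc -ljack
}
```

As the issue above describes, leaving out -lc does not produce a warning, so it is worth checking first when a call into a C library crashes at address 0x0.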

Thanks a lot for the advice! I would have lost a lot more time finding the missing libc problem...
