Weird compiler (0.15.1) behavior

For one of my programs, the compiler produces perfectly working code with -O ReleaseFast and -O ReleaseSafe, but produces code that crashes with SIGSEGV when using either no option or -O Debug (!)
When compiled with Debug and started with lldb, the error is quite amazing:
stop reason = signal SIGSEGV: address not mapped to object (fault address: 0x7ffffeffeeb0)
The address seems to be the start of a perfectly ordinary function.
Does anyone have an idea?

I have never seen that before. Anyone?

We have nothing to work with. Without your code we cannot tell whether that is expected or not.

There is nothing special in the code. It is a plain function call. There is a debug print just before the call and a debug print at the very start of the function; the first appears, the second never does. I was wondering whether there was a difference in the code-generating backend or in the optimizations in 0.15.1 (the code was OK in 0.14.1).

stderr.print("calling compute_sym_alt\n", .{});
compute_sym_alt();

…
fn compute_sym_alt() void {
    stderr.print("sym_alt\n", .{});
…

As of 0.15, Zig uses the self-hosted x86_64 backend by default for Debug builds, which would fit with only your Debug build crashing. Given that this backend is still fairly new, there may be some problems with it. To work around it you may be able to pass the flag -fllvm, which tells the compiler to use the older LLVM backend.
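A minimal sketch of making that workaround permanent in a build script, assuming a project with a `src/main.zig` and an executable named `repro` (both names are hypothetical, not from the original program). On the command line the flag is simply appended, for instance `zig build-exe main.zig -fllvm`. In `build.zig`, as far as I know the equivalent is the `use_llvm` option of `addExecutable`:

```zig
const std = @import("std");

pub fn build(b: *std.Build) void {
    const target = b.standardTargetOptions(.{});
    const optimize = b.standardOptimizeOption(.{});

    const exe = b.addExecutable(.{
        // "repro" and "src/main.zig" are placeholder names for this sketch.
        .name = "repro",
        .root_module = b.createModule(.{
            .root_source_file = b.path("src/main.zig"),
            .target = target,
            .optimize = optimize,
        }),
        // Force the LLVM backend even for Debug builds,
        // instead of the self-hosted x86_64 backend.
        .use_llvm = true,
    });
    b.installArtifact(exe);
}
```

Comparing a build with and without this option is also a quick way to check whether the backend is the variable that matters.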

If that is the problem, then it would be helpful if you could post a minimal program that breaks and file a bug report on GitHub so that the compiler developers can take a look at it.

Generally, in order to help you it would be good to have a full program; otherwise we can only guess what really went wrong. For example, here the bug may also lie in the code that constructs stderr or in a completely separate part of the program. It’s also difficult to try your code if I first have to reimplement half of it myself, compared to just copy-pasting a working program.

It works with -fllvm.
I am going to try to find a minimal part that still breaks. The whole program is quite large.

Here is a minimal program that breaks:

const std = @import("std");

const NB_SYMS: usize = 8;
const NB_DIMS: usize = 4;
const NB_ELEMS: usize = 1 << 16;
var hashesv: [NB_SYMS][NB_DIMS][NB_ELEMS]u64 = undefined;

fn compute_sym_alt() void {
    std.debug.print("three\n", .{});
    for (0..NB_ELEMS) |n| {
        for (1..NB_SYMS) |k| {
            hashesv[k][0][n] = hashesv[0][0][n];
        }
    }
}

pub fn main() !void {
    std.debug.print("one\n", .{});
    for (0..NB_DIMS) |i| {
        for (0..NB_ELEMS) |j| hashesv[0][i][j] = 0;
    }
    std.debug.print("two\n", .{});
    compute_sym_alt();
}

Edit: no, it actually does segfault with the new backend:

one
two
Segmentation fault (core dumped)

I think the problem you’re having is that you try to debug with lldb something that wasn’t compiled with LLVM.

This is the formatted code

const std = @import("std");

const NB_SYMS: usize = 8;
const NB_DIMS: usize = 4;
const NB_ELEMS: usize = 1 << 16;
var hashesv: [NB_SYMS][NB_DIMS][NB_ELEMS]u64 = undefined;

fn compute_sym_alt() void {
    std.debug.print("three\n", .{});
    for (0..NB_ELEMS) |n| {
        for (1..NB_SYMS) |k| {
            hashesv[k][0][n] = hashesv[0][0][n];
        }
    }
}

pub fn main() !void {
    std.debug.print("one\n", .{});
    for (0..NB_DIMS) |i| {
        for (0..NB_ELEMS) |j| hashesv[0][i][j] = 0;
    }
    std.debug.print("two\n", .{});
    compute_sym_alt();
}

but it compiles and runs with no errors with the new backend.

I know what this is: Zig sometimes copies the entire array onto the stack when accessing a single element (see ziglang/zig issue #13938, “Another array access performance issue”). Apart from performance problems, this can also cause a stack overflow when the array being copied is large enough. And currently Zig doesn’t handle stack overflows correctly, leading to a segmentation fault without any stack trace (see ziglang/zig issue #7371, “Panic on stack overflow/segfault in debug mode”).
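To put numbers on that explanation: the array in the minimal program is 8 * 4 * 65536 elements of 8 bytes each, which is 16 MiB, while the main-thread stack limit on Linux is commonly 8 MiB. The reported fault address 0x7ffffeffeeb0 also sits roughly 16 MiB below the usual top of the stack, which is consistent with a full copy being placed there. The sketch below checks the size at compile time and shows one possible way to phrase the loop through pointers to the inner rows. Whether this actually avoids the copy on the 0.15.1 self-hosted backend is an assumption I have not verified; the pointer names `src` and `dst` are mine, not from the original program.

```zig
const std = @import("std");

const NB_SYMS: usize = 8;
const NB_DIMS: usize = 4;
const NB_ELEMS: usize = 1 << 16;
var hashesv: [NB_SYMS][NB_DIMS][NB_ELEMS]u64 = undefined;

// 8 * 4 * 65536 elements * 8 bytes = 16 MiB for the whole array.
comptime {
    std.debug.assert(@sizeOf(@TypeOf(hashesv)) == 16 * 1024 * 1024);
}

fn compute_sym_alt() void {
    // Take the address of the source row once instead of indexing
    // hashesv[0][0][n] for every element.
    const src: *const [NB_ELEMS]u64 = &hashesv[0][0];
    for (1..NB_SYMS) |k| {
        // Destination rows have k >= 1, so they never overlap the source row.
        const dst: *[NB_ELEMS]u64 = &hashesv[k][0];
        @memcpy(dst, src);
    }
}

pub fn main() void {
    // Same initialization as the original minimal program.
    for (0..NB_DIMS) |i| {
        for (0..NB_ELEMS) |j| hashesv[0][i][j] = 0;
    }
    compute_sym_alt();
    std.debug.print("done\n", .{});
}
```

This copies each row in one go rather than element by element, which produces the same final contents as the original nested loops.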

Luckily for you, the issue was fixed just a few days ago, so if you try the latest zig-0.16-dev release it should work. If you can’t afford to switch, then I guess you can just stick to the LLVM backend until 0.16.0 is released.

I switched to 0.16.0 and it is perfectly OK. Thanks!
