How to recognize and fix PRO segfaults?

In my project, I have a bunch of functions that take in a Type tagged union. However, I am now getting a segmentation fault when reading this Type parameter:

Segmentation fault at address 0x7547ece4700c
/home/sno2/projects/xlang/src/Smith.zig:154:13: 0x1021918 in genExpression__anon_2699 (smith)
    switch (target_type) {
            ^
/home/sno2/projects/xlang/src/Smith.zig:132:36: 0x102185a in genExpression__anon_2699 (smith)
            try smith.genExpression(target_type, writer);
                                   ^
/home/sno2/projects/xlang/src/Smith.zig:132:36: 0x102185a in genExpression__anon_2699 (smith)
            try smith.genExpression(target_type, writer);
                                   ^
/home/sno2/projects/xlang/src/Smith.zig:240:36: 0x1021eea in genExpression__anon_2699 (smith)
            try smith.genExpression(smith.type_extras.items[pair_i + 1], writer);
                                   ^
/home/sno2/projects/xlang/src/Smith.zig:238:36: 0x1021e6d in genExpression__anon_2699 (smith)
            try smith.genExpression(smith.type_extras.items[pair_i], writer);
                                   ^
/home/sno2/projects/xlang/src/Smith.zig:265:32: 0x1023089 in main (smith)
        try smith.genExpression(target_type, source.writer(smith.gpa));
                               ^
/home/sno2/projects/zig/lib/std/start.zig:617:37: 0x10208b1 in posixCallMainAndExit (smith)
            const result = root.main() catch |err| {
                                    ^
/home/sno2/projects/zig/lib/std/start.zig:248:5: 0x10204ed in _start (smith)
    asm volatile (switch (native_arch) {
    ^

Every segfault stack trace I get includes a call that passes an element of smith.type_extras.items directly as an argument. I believe PRO (parameter reference optimization) is passing my Type parameter by reference; I then modify smith.type_extras, and the PRO-optimized reference ends up pointing to freed memory. Is there any way to work around this?

After rewriting every call site to copy the intermediate Type values into local variables before the IO operations, it no longer segfaults. Here is an example of one such conversion:

            try writer.writeAll("(cons ");
            try smith.genExpression(smith.type_extras.items[pair_i], writer);
            try writer.writeByte(' ');
            try smith.genExpression(smith.type_extras.items[pair_i + 1], writer);
            try writer.writeByte(')');

into

            const left_type, const right_type = smith.type_extras.items[pair_i..][0..2].*;
            try writer.writeAll("(cons ");
            try smith.genExpression(left_type, writer);
            try writer.writeByte(' ');
            try smith.genExpression(right_type, writer);
            try writer.writeByte(')');

I would also expect this kind of error if you hold on to the items slice somewhere, add new elements to the ArrayList (causing it to reallocate the items somewhere else), and then continue to use the old slice. Could it be something like that instead?
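
To make that failure mode concrete, here is a minimal sketch of the copy-first pattern. It assumes Zig 0.14 or newer (for the `.empty` initializer) and uses a `std.ArrayListUnmanaged(u32)` as a stand-in for type_extras; `grow` and `readPairThenGrow` are hypothetical names for illustration, not functions from the project.

```zig
const std = @import("std");

/// Hypothetical helper: appending may reallocate the list's backing buffer,
/// which invalidates any slice or pointer taken from `items` beforehand.
fn grow(gpa: std.mem.Allocator, extras: *std.ArrayListUnmanaged(u32)) !void {
    try extras.appendNTimes(gpa, 0, 1024);
}

/// Hypothetical helper: copies two adjacent elements out by value *before*
/// the list is allowed to grow, so the result never refers to the old buffer.
fn readPairThenGrow(
    gpa: std.mem.Allocator,
    extras: *std.ArrayListUnmanaged(u32),
    pair_i: usize,
) ![2]u32 {
    // `items[pair_i..][0..2]` is a `*[2]u32`; `.*` copies the array by value.
    const pair = extras.items[pair_i..][0..2].*;
    try grow(gpa, extras);
    return pair;
}

test "values copied before growth stay valid" {
    const gpa = std.testing.allocator;

    var extras: std.ArrayListUnmanaged(u32) = .empty;
    defer extras.deinit(gpa);
    try extras.appendSlice(gpa, &.{ 10, 20 });

    const pair = try readPairThenGrow(gpa, &extras, 0);
    try std.testing.expectEqual(@as(u32, 10), pair[0]);
    try std.testing.expectEqual(@as(u32, 20), pair[1]);
}
```

Had `readPairThenGrow` kept `extras.items[pair_i..][0..2]` as a pointer and dereferenced it after `grow`, it could read freed memory once the buffer moved. Copying first sidesteps the problem whether the stale reference comes from user code or from the compiler passing a parameter by reference.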

I should have put this in the question itself, but I used indexes into type_extras itself to avoid managing pointers:

pub const Type = union(enum) {
    unit,
    num,
    bool,
    string,
    /// extras[function=arg_count, arg0, arg1, ..., argn, return]
    function: ExtraIndex,
    /// extras[a]
    ref: ExtraIndex,
    /// extras[a, b]
    pair: ExtraIndex,
    /// extras[a]
    list: ExtraIndex,

    const ExtraIndex = u32;

    pub const Tag = std.meta.Tag(Type);
};

Here’s a trick that works in any language you can debug with GDB:

Use rr to run your application until the segfault happens. Then put a hardware watchpoint on the bad memory address. Then reverse-cont to run the program backwards until that memory changes. This typically will take you to exactly the point where the memory became corrupted, and you can obtain a stack trace as well as poke around in the debugger.


Thank you for the advice, Andrew. I will get this set up on my machine.

Here’s an example of me using rr; maybe it will help.
