Stack probe puzzle

Once upon a midnight dreary, while I pondered, weak and weary

Three days ago, I came across this nice little rabbit hole. I still haven’t reached the bottom. It is getting lonely in here, so I might as well start a diary.

The story begins on https://devhub.tigerbeetle.com, with a little red line:

dcd4339 (main)    lsm_forest    ./zig/zig build -Drelease fuzz -- lsm_forest 11137203625256914804    4s 15h 19m 21s ago    1

Look, a fuzz failure in one of TigerBeetle’s minor fuzzers! Conveniently, TigerBeetle is fully deterministic, so reproducing that is as easy as

$ git clone https://github.com/tigerbeetle/tigerbeetle && cd tigerbeetle
$ git switch --detach dcd4339
$ ./zig/download.sh
$ ./zig/zig build -Drelease fuzz -- lsm_forest 11137203625256914804
info(fuzz): Fuzz seed = 11137203625256914804
info(lsm_forest_fuzz): fuzz_op_count = 778280
info(lsm_forest_fuzz): action_weights = enums.EnumFieldStruct(@typeInfo(lsm.forest_fuzz.FuzzOpAction).Union.tag_type.?,u64,null){ .compact = 1, .put_account = 8, .get_account = 4, .exists_account = 0, .scan_account = 0 }
info(lsm_forest_fuzz): modifier_weights = enums.EnumFieldStruct(@typeInfo(lsm.forest_fuzz.FuzzOpModifier).Union.tag_type.?,u64,null){ .normal = 100, .crash_after_ticks = 1 }
info(lsm_forest_fuzz): puts_since_compact_max = 30
info(lsm_forest_fuzz): Passed!
info(fuzz): done in 6.795s
$

Huh? It didn’t fail? This can’t be! Could it be that I am running this on macOS, but the problem is Linux-specific? Let me try remote-running it on my Linux box:

$ rr ./zig/zig build -Drelease fuzz -- lsm_forest 11137203625256914804
...
info(fuzz): done in 16.261s

Hm, maybe it is somehow non-deterministic after all? Shall we repeat the exercise some more?

$ rr n 128 ./zig/zig build -Drelease fuzz -- lsm_forest 11137203625256914804
...
Run 128
info(fuzz): done in 16.216s

Nope, still no dice.

Actually, how exactly does it fail on our fuzzing infra?

"debug": "process.Child.Term{ .Signal = 11 }"

But that is SIGSEGV, a segmentation fault! Sounds bad?

Well, I bet that’s just a stack overflow, the most common source of scary-looking segfaults in my experience. Are we almost bumping into the stack limit here? Let’s try to repro with a smaller stack.

$ ulimit -s
8192
$ ulimit -s 2096
$ ulimit -s
2096

$ ./zig/zig build -Drelease fuzz -- lsm_forest 11137203625256914804
info(lsm_forest_fuzz): Passed!
info(fuzz): done in 16.875s

Hm, so we are nowhere near the default limit of 8MiB… Still, let’s check the actual stack size ulimit for a real fuzzing process on our infra:

root@95-216-12-250:~# cat /proc/322972/limits
Limit                     Soft Limit           Hard Limit           Units
...
Max stack size            8388608              unlimited            bytes
...

Yeah, ok, could it be that we have a real segfault this time? Not some puny guard page, but an actual, honest-to-Bosch nasal demon? Exciting!

Good thing we have a core dump! Let’s look inside:

$ zstd -d core.zst
core.zst            : 1235386368 bytes

$ ./zig/zig build -Drelease fuzz:build -Dprint-exe
/home/matklad/p/tb/work/.zig-cache/o/bf202d80860928a9a9338add95f64972/fuzz

$ cp /home/matklad/p/tb/work/.zig-cache/o/bf202d80860928a9a9338add95f64972/fuzz fuzz

$ gdb fuzz core
Reading symbols from fuzz...

warning: Can't open file /root/tigerbeetle/working/main/.zig-cache/o/bf202d80860928a9a9338add95f64972/fuzz during file-backed mapping note processing

warning: exec file is newer than core file.
[New LWP 143130]
Core was generated by `/root/tigerbeetle/working/main/.zig-cache/o/bf202d80860928a9a9338add95f64972/fuzz lsm_forest 11137203625256914804'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x000000000133fa34 in compiler_rt.stack_probe.zig_probe_stack ()
    at /home/matklad/p/tb/work/zig/lib/compiler_rt/stack_probe.zig:53
53                  asm volatile (

Well, nope, that’s the stack probe; this is definitely a puny stack overflow. What’s our stack,
actually?

(gdb) bt
#0  0x000000000133fa34 in compiler_rt.stack_probe.zig_probe_stack ()
    at /home/matklad/p/tb/work/zig/lib/compiler_rt/stack_probe.zig:53
Backtrace stopped: Cannot access memory at address 0x7ffc39f6aca8
(gdb)

That’s :sad-face:. Ok, let’s try to at least look at the current code:

(gdb) disassemble
Dump of assembler code for function compiler_rt.stack_probe.zig_probe_stack:
   0x000000000133fa20 <+0>:     push   rcx
   0x000000000133fa21 <+1>:     mov    rcx,rax
   0x000000000133fa24 <+4>:     cmp    rcx,0x1000
   0x000000000133fa2b <+11>:    jb     0x133fa49 <compiler_rt.stack_probe.zig_probe_stack+41>
   0x000000000133fa2d <+13>:    sub    rsp,0x1000
=> 0x000000000133fa34 <+20>:    or     DWORD PTR [rsp+0x10],0x0
   0x000000000133fa39 <+25>:    sub    rcx,0x1000
   0x000000000133fa40 <+32>:    cmp    rcx,0x1000
   0x000000000133fa47 <+39>:    ja     0x133fa2d <compiler_rt.stack_probe.zig_probe_stack+13>
   0x000000000133fa49 <+41>:    sub    rsp,rcx
   0x000000000133fa4c <+44>:    or     DWORD PTR [rsp+0x10],0x0
   0x000000000133fa51 <+49>:    add    rsp,rax
   0x000000000133fa54 <+52>:    pop    rcx
   0x000000000133fa55 <+53>:    ret

Yup, definitely a stack probe.

What’s a stack probe, actually? The stack is finite, 8MiB or so typically. So if you declare

var big: [16*1024*1024]u8 = undefined;

then writing big[0] will access memory outside the stack. The good case is that you step into unmapped memory and get a segmentation fault. The bad case is stepping into memory that is mapped, like the heap or another thread’s stack.

The idea of a stack probe is to force the program to step into unmapped memory, guaranteeing a safe segfault. This needs two components. First, the language runtime mmaps unreadable & unwritable memory just after the stack ends. Second, the compiler promises that, whenever the user writes something like that big above, it will insert stack probe code before the allocation is used. The stack probe touches every page of the new stack frame, sequentially, guaranteeing that we hit the guard page before we corrupt the heap or something:

+------------------+ higher addresses
|      STACK       |
+------------------+
         ⋮
+------------------+
|    HOT LAVA      | (guard page)
+------------------+
         ⋮
         ⋮
         ⋮
+------------------+
|      HEAP        |
+------------------+ lower addresses
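
In pseudo-Python, the probe loop does roughly this (a sketch of the logic, not the real compiler-rt code; touch() is a stand-in for the actual memory write):

PAGE = 0x1000  # 4KiB

def touch(addr):
    # Stand-in for the asm's `or DWORD PTR [rsp+0x10],0x0`: in the real probe,
    # this write is what faults when addr lands on the guard page.
    pass

def probe_stack(rsp, frame_size):
    # frame_size arrives in rax; the asm keeps the countdown in rcx.
    remaining = frame_size
    while remaining >= PAGE:
        rsp -= PAGE
        touch(rsp)            # touch one word per page, top to bottom
        remaining -= PAGE
    rsp -= remaining
    touch(rsp)                # probe the final, partial page
    return rsp + frame_size   # `add rsp,rax` restores the stack pointer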

Ok, that’s the theory! And here’s practice, again:

(gdb) bt
#0  0x000000000133fa34 in compiler_rt.stack_probe.zig_probe_stack ()
    at /home/matklad/p/tb/work/zig/lib/compiler_rt/stack_probe.zig:53
Backtrace stopped: Cannot access memory at address 0x7ffc39f6aca8
(gdb) disassemble
Dump of assembler code for function compiler_rt.stack_probe.zig_probe_stack:
   0x000000000133fa20 <+0>:     push   rcx
   0x000000000133fa21 <+1>:     mov    rcx,rax
   0x000000000133fa24 <+4>:     cmp    rcx,0x1000
   0x000000000133fa2b <+11>:    jb     0x133fa49 <compiler_rt.stack_probe.zig_probe_stack+41>
   0x000000000133fa2d <+13>:    sub    rsp,0x1000
=> 0x000000000133fa34 <+20>:    or     DWORD PTR [rsp+0x10],0x0
   0x000000000133fa39 <+25>:    sub    rcx,0x1000
   0x000000000133fa40 <+32>:    cmp    rcx,0x1000
   0x000000000133fa47 <+39>:    ja     0x133fa2d <compiler_rt.stack_probe.zig_probe_stack+13>
   0x000000000133fa49 <+41>:    sub    rsp,rcx
   0x000000000133fa4c <+44>:    or     DWORD PTR [rsp+0x10],0x0
   0x000000000133fa51 <+49>:    add    rsp,rax
   0x000000000133fa54 <+52>:    pop    rcx
   0x000000000133fa55 <+53>:    ret
End of assembler dump.
(gdb)

Looking closer at the assembly, I notice something. The stack probe works by modifying the stack pointer, rsp, in place. As per the comment in the source,

// %rax = probe length, %rsp = stack pointer

so what we do is set rcx to an “iteration counter” equal to rax:

mov    rcx,rax

and then we subtract from both rcx and rsp in increments of 0x1000, which is 4KiB, the page size:

sub    rsp,0x1000
sub    rcx,0x1000

When rcx runs out, we probe whatever partial page remains and restore rsp:

add    rsp,rax

That is, if everything goes right. Which it didn’t: we crashed in the middle of the loop, so our rsp is all messed up. But we can unscramble this egg. We crashed after a sub rsp but before the matching sub rcx, so rsp is one page, plus the already-completed iterations (rax - rcx), below its pre-loop value:

(gdb) info registers rsp
rsp            0x7ffc39f6aca8      0x7ffc39f6aca8
(gdb) info registers rcx
rcx            0x67b8              26552
(gdb) info registers rax
rax            0x207b8             133048
$ python
>>> hex((0x7ffc39f6aca8 + 0x1000) - 0x67b8 + 0x207b8)
'0x7ffc39f85ca8'
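
Wrapped into a function, with the crash values as a sanity check (my own helper, nothing from the toolchain):

def unscramble_rsp(rsp, rcx, rax, page=0x1000):
    # We crashed between `sub rsp,0x1000` and `sub rcx,0x1000`, so rsp is
    # one page ahead of rcx; add that page and the completed iterations back.
    return rsp + page - rcx + rax

assert unscramble_rsp(0x7ffc39f6aca8, 0x67b8, 0x207b8) == 0x7ffc39f85ca8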

And we can ask a local LLM to write a Python script to patch the core file:

$ q "write me a Python script to read core file with x86_64 core dump file and patch it to
    change the value of rsp register from 0x7ffc39f6aca8 to 0x7ffc39f85ca8" > main.py

$ cat main.py
with open('core', 'rb') as f:
    data = bytearray(f.read())

# The six significant bytes of the old and new rsp values, little-endian.
old = b'\xa8\xac\xf6\x39\xfc\x7f'
new = b'\xa8\x5c\xf8\x39\xfc\x7f'

# Patch the first occurrence of the pattern in place.
pos = data.find(old)
if pos >= 0:
    data[pos:pos+len(old)] = new

with open('core', 'wb') as f:
    f.write(data)
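
A crude find-and-replace over raw bytes, but good enough here: the .reg note holding the registers sits near the very start of the core file, before any of the memory contents, so the first match of those six bytes should be the one we want. A proper script would parse the ELF program headers and notes instead.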

Let’s try for a backtrace again!

$ python main.py
$ gdb fuzz core
Reading symbols from fuzz...

warning: Can't open file /root/tigerbeetle/working/main/.zig-cache/o/bf202d80860928a9a9338add95f64972/fuzz during file-backed mapping note processing
[New LWP 143130]
Core was generated by `/root/tigerbeetle/working/main/.zig-cache/o/bf202d80860928a9a9338add95f64972/fuzz lsm_forest 11137203625256914804'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x000000000133fa34 in compiler_rt.stack_probe.zig_probe_stack ()
    at /home/matklad/p/tb/work/zig/lib/compiler_rt/stack_probe.zig:53
53                  asm volatile (
(gdb) bt
#0  0x000000000133fa34 in compiler_rt.stack_probe.zig_probe_stack ()
    at /home/matklad/p/tb/work/zig/lib/compiler_rt/stack_probe.zig:53
#1  0x00007ffc39f9bea8 in ?? ()
#2  0x00000000012b7617 in sort.block.block__anon_41114 () at /home/matklad/p/tb/work/zig/lib/std/sort/block.zig:105
#3  0x00000000012381a6 in mem.sort__anon_40580 () at /home/matklad/p/tb/work/zig/lib/std/mem.zig:567
#4  lsm.table_memory.TableMemoryType(lsm.table.TableType(u64,state_machine.StateMachineType(testing.storage.Storage,.{ .release = .{ ... }, .message_body_size_max = 3840, .lsm_compaction_ops = 4, .vsr_operations_reserved = 128 }).AccountEvent,(function 'key_from_value'),18446744073709551615,(function 'tombstone'),(function 'tombstone_from_key'),120,.general)).sort_suffix_from_offset (offset=0) at /home/matklad/p/tb/work/src/lsm/table_memory.zig:232
#5  0x0000000001163365 in lsm.table_memory.TableMemoryType(lsm.table.TableType(u64,state_machine.StateMachineType(testing.storage.Storage,.{ .release = .{ ... }, .message_body_size_max = 3840, .lsm_compaction_ops = 4, .vsr_operations_reserved = 128 }).AccountEvent,(function 'key_from_value'),18446744073709551615,(function 'tombstone'),(function 'tombstone_from_key'),120,.general)).mutable_sort_suffix_from_offset (offset=0) at /home/matklad/p/tb/work/src/lsm/table_memory.zig:222
#6  lsm.table_memory.TableMemoryType(lsm.table.TableType(u64,state_machine.StateMachineType(testing.storage.Storage,.{ .release = .{ ... }, .message_body_size_max = 3840, .lsm_compaction_ops = 4, .vsr_operations_reserved = 128 }).AccountEvent,(function 'key_from_value'),18446744073709551615,(function 'tombstone'),(function 'tombstone_from_key'),120,.general)).sort_suffix ()
    at /home/matklad/p/tb/work/src/lsm/table_memory.zig:212
...
#21 fuzz_tests.main () at fuzz_tests.zig:73
#22 start.callMain () at /home/matklad/p/tb/work/zig/lib/std/start.zig:524
#23 start.callMainWithArgs () at /home/matklad/p/tb/work/zig/lib/std/start.zig:482
#24 start.posixCallMainAndExit () at /home/matklad/p/tb/work/zig/lib/std/start.zig:438
#25 0x00000000010566b2 in start._start () at /home/matklad/p/tb/work/zig/lib/std/start.zig:266

Yeah! I got it! That’s the backtrace. So who’s the culprit? A giant array on the stack in
Zig’s sort:

var cache: [512]T = undefined;

We have a large T, AccountEvent, and 512 of those add up to roughly 133048 bytes, the rax from our stack probe disassembly!
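
The arithmetic checks out if @sizeOf(AccountEvent) is 256 bytes (an assumption on my part, I haven’t printed the size, but the numbers fit):

$ python
>>> 512 * 256            # the cache array
131072
>>> 0x207b8 - 512 * 256  # rax minus the cache: a couple of KiB for the rest of the frame
1976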

As was foretold

So this is pretty much your typical stack overflow. The stack’s not too deep, but the leaf function makes a giant stack allocation. The compiler invokes the stack probe before using that allocation, the probe hits the guard page, and that gives the segfault.

Except that we’ve indirectly measured that we should have plenty of stack. How much stack are we actually using here?

#25 0x00000000010566b2 in start._start () at /home/matklad/p/tb/work/zig/lib/std/start.zig:266
(gdb) frame 0
#0  0x000000000133fa34 in compiler_rt.stack_probe.zig_probe_stack ()
    at /home/matklad/p/tb/work/zig/lib/compiler_rt/stack_probe.zig:53
53                  asm volatile (
(gdb) set $topsp=$sp
(gdb) frame 25
#25 0x00000000010566b2 in start._start () at /home/matklad/p/tb/work/zig/lib/std/start.zig:266
266         asm volatile (switch (native_arch) {
(gdb) print (char *)$sp - (char *)$topsp
$1 = 427656
(gdb)

That’s just shy of half a meg! Granted, not a tiny amount of stack, but well within what should be allowed. Let’s get at it from another angle: what does the process address space look like?

(gdb) maintenance info sections
Exec file: `/home/matklad/p/tb/work/fuzz', file type elf64-x86-64.
 [0]      0x01000240->0x0103130e at 0x00000240: .rodata ALLOC LOAD READONLY DATA HAS_CONTENTS
 ...
 [18]     0x00000000->0x00000013 at 0x0164f298: .comment READONLY HAS_CONTENTS
Core file: `/home/matklad/p/tb/work/core', file type elf64-x86-64.
 [0]      0x00000000->0x00001174 at 0x00000430: note0 READONLY HAS_CONTENTS
 [1]      0x00000000->0x000000d8 at 0x000004b4: .reg/143130 HAS_CONTENTS
 [2]      0x00000000->0x000000d8 at 0x000004b4: .reg HAS_CONTENTS
 [3]      0x00000000->0x00000080 at 0x00000644: .note.linuxcore.siginfo/143130 HAS_CONTENTS
 [4]      0x00000000->0x00000080 at 0x00000644: .note.linuxcore.siginfo HAS_CONTENTS
 [5]      0x00000000->0x00000150 at 0x000006d8: .auxv HAS_CONTENTS
 [6]      0x00000000->0x000001b8 at 0x0000083c: .note.linuxcore.file/143130 HAS_CONTENTS
 [7]      0x00000000->0x000001b8 at 0x0000083c: .note.linuxcore.file HAS_CONTENTS
 [8]      0x00000000->0x00000200 at 0x00000a08: .reg2/143130 HAS_CONTENTS
 [9]      0x00000000->0x00000200 at 0x00000a08: .reg2 HAS_CONTENTS
 [10]     0x00000000->0x00000988 at 0x00000c1c: .reg-xstate/143130 HAS_CONTENTS
 [11]     0x00000000->0x00000988 at 0x00000c1c: .reg-xstate HAS_CONTENTS
 [12]     0x01000000->0x01001000 at 0x00002000: load1a ALLOC LOAD READONLY HAS_CONTENTS
 [13]     0x01001000->0x01056000 at 0x00003000: load1b ALLOC READONLY
 [14]     0x01056000->0x01340000 at 0x00003000: load2 ALLOC READONLY CODE
 [15]     0x01340000->0x01341000 at 0x00003000: load3 ALLOC LOAD HAS_CONTENTS
 [16]     0x01341000->0x01342000 at 0x00004000: load4 ALLOC LOAD HAS_CONTENTS
 [17]     0x01342000->0x01346000 at 0x00005000: load5 ALLOC LOAD HAS_CONTENTS
 [18]     0x7ffbf03a3000->0x7ffbf8806000 at 0x00009000: load6 ALLOC LOAD HAS_CONTENTS
 [19]     0x7ffbf8808000->0x7ffbf8809000 at 0x0846c000: load7 ALLOC LOAD HAS_CONTENTS
 [20]     0x7ffbf8931000->0x7ffc3898b000 at 0x0846d000: load8 ALLOC LOAD HAS_CONTENTS
 [21]     0x7ffc3898c000->0x7ffc389a7000 at 0x484c7000: load9 ALLOC LOAD HAS_CONTENTS
 [22]     0x7ffc389a8000->0x7ffc389d0000 at 0x484e2000: load10 ALLOC LOAD HAS_CONTENTS
 [23]     0x7ffc389d1000->0x7ffc389eb000 at 0x4850a000: load11 ALLOC LOAD HAS_CONTENTS
 [24]     0x7ffc389ec000->0x7ffc38a23000 at 0x48524000: load12 ALLOC LOAD HAS_CONTENTS
 [25]     0x7ffc38a25000->0x7ffc39acc000 at 0x4855b000: load13 ALLOC LOAD HAS_CONTENTS
 [26]     0x7ffc39ad0000->0x7ffc39e6b000 at 0x49602000: load14 ALLOC LOAD HAS_CONTENTS
 [27]     0x7ffc39f6b000->0x7ffc39ff0000 at 0x4999d000: load15 ALLOC LOAD HAS_CONTENTS
 [28]     0x7ffc39ffa000->0x7ffc39ffe000 at 0x49a22000: load16 ALLOC LOAD READONLY HAS_CONTENTS
 [29]     0x7ffc39ffe000->0x7ffc3a000000 at 0x49a26000: load17 ALLOC LOAD READONLY CODE HAS_CONTENTS

Or, enhancing the region of interest:

0x7ffc38a25000->0x7ffc39acc000 at 0x4855b000: load13 ALLOC LOAD HAS_CONTENTS
0x7ffc39ad0000->0x7ffc39e6b000 at 0x49602000: load14 ALLOC LOAD HAS_CONTENTS -- 3692KiB of something
                                                              ^
0x7ffc39f6aca8 <- address stack probe steps into and KABOOM   | 1MiB gap
                                                              V
0x7ffc39f6b000->0x7ffc39ff0000 at 0x4999d000: load15 ALLOC LOAD HAS_CONTENTS -- 532KiB of stack
0x7ffc39ffa000->0x7ffc39ffe000 at 0x49a22000: load16 ALLOC LOAD READONLY HAS_CONTENTS
0x7ffc39ffe000->0x7ffc3a000000 at 0x49a26000: load17 ALLOC LOAD READONLY CODE HAS_CONTENTS
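
Double-checking those annotations with a bit of arithmetic:

$ python
>>> hex(0x7ffc39f6b000 - 0x7ffc39e6b000)       # the gap below the stack
'0x100000'
>>> (0x7ffc39e6b000 - 0x7ffc39ad0000) // 1024  # load14, the something
3692
>>> (0x7ffc39ff0000 - 0x7ffc39f6b000) // 1024  # load15, the stack
532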

It was later revealed to me in a dream that my knowledge about stack guards was all lies and dangerous half-truths.

There’s no guard page for the main thread. The hot lava is invisible. There’s no stack either! Remember how simpletons complain that 8MiB for a stack is too large, and the learned men correct them that it’s just virtual memory? Obtuse lore: the kernel doesn’t even bother mmapping the main stack! The mapping is materialized on demand. And the kernel segfaults the program if it dares to touch memory close enough to the next mapping after the stack.

How close? One page, originally, but, because the C people don’t tend to care about stack probes, that was extended to 256 pages after The Stack Clash of 2017.

And 256 pages of 4KiB is exactly the 1MiB gap I saw. There’s a glimmer of hope, a faint dream of finding a way out of this rabbit hole.

The program starts with a small stack, and extends it one page at a time, stack probing along the way. As soon as it gets within 1MiB of the next mmapped region, it is segfaulted.

Who mapped that region? And what is there?

(gdb) dump binary memory memory.bin 0x7ffc39ad0000 0x7ffc39e6b000
$ hexyl memory.bin
│00390000│ 00 f0 e5 39 fc 7f 00 00 ┊ 02 00 02 00 aa aa aa aa │⋄××9ו⋄⋄┊•⋄•⋄××××│
│00390010│ 03 aa 58 05 80 07 03 03 ┊ 00 00 00 00 00 00 00 00 │•×X•ו••┊⋄⋄⋄⋄⋄⋄⋄⋄│
│00390020│ 00 00 00 00 00 00 00 00 ┊ 00 00 00 00 00 00 00 00 │⋄⋄⋄⋄⋄⋄⋄⋄┊⋄⋄⋄⋄⋄⋄⋄⋄│
│*       │                         ┊                         │        ┊        │
│00391000│ aa aa aa aa aa aa aa aa ┊ aa aa aa aa aa aa aa aa │××××××××┊××××××××│
│*       │                         ┊                         │        ┊        │
│00391800│ 00 00 00 00 00 00 00 00 ┊ 00 00 00 00 00 00 00 00 │⋄⋄⋄⋄⋄⋄⋄⋄┊⋄⋄⋄⋄⋄⋄⋄⋄│
│*       │                         ┊                         │        ┊        │
│00391eb0│ 00 00 00 00 00 00 00 00 ┊ aa aa aa aa aa aa aa aa │⋄⋄⋄⋄⋄⋄⋄⋄┊××××××××│
│00391ec0│ aa aa aa aa aa aa aa aa ┊ aa aa aa aa aa aa aa aa │××××××××┊××××××××│
│*       │                         ┊                         │        ┊        │
│00392000│ 00 10 e6 39 fc 7f 00 00 ┊ 02 00 02 00 aa aa aa aa │⋄•×9ו⋄⋄┊•⋄•⋄××××│
│00392010│ 03 aa 80 07 b8 06 03 03 ┊ 00 00 00 00 00 00 00 00 │•×וו••┊⋄⋄⋄⋄⋄⋄⋄⋄│
│00392020│ 00 00 00 00 00 00 00 00 ┊ 00 00 00 00 00 00 00 00 │⋄⋄⋄⋄⋄⋄⋄⋄┊⋄⋄⋄⋄⋄⋄⋄⋄│
│*       │                         ┊                         │        ┊        │
│00393000│ 00 00 00 00 aa aa aa aa ┊ aa aa aa aa aa aa aa aa │⋄⋄⋄⋄××××┊××××××××│
│00393010│ aa aa aa aa aa aa aa aa ┊ aa aa aa aa aa aa aa aa │××××××××┊××××××××│
│*       │                         ┊                         │        ┊        │
│00393400│ 00 00 00 00 aa aa aa aa ┊ aa aa aa aa aa aa aa aa │⋄⋄⋄⋄××××┊××××××××│
│00393410│ aa aa aa aa aa aa aa aa ┊ aa aa aa aa aa aa aa aa │××××××××┊××××××××│
│*       │                         ┊                         │        ┊        │
│00393800│ 00 00 00 00 aa aa aa aa ┊ aa aa aa aa aa aa aa aa │⋄⋄⋄⋄××××┊××××××××│
│00393810│ aa aa aa aa aa aa aa aa ┊ aa aa aa aa aa aa aa aa │××××××××┊××××××××│
│*       │                         ┊                         │        ┊        │
│00393c00│ 00 00 00 00 aa aa aa aa ┊ aa aa aa aa aa aa aa aa │⋄⋄⋄⋄××××┊××××××××│
│00393c10│ aa aa aa aa aa aa aa aa ┊ aa aa aa aa aa aa aa aa │××××××××┊××××××××│
│*       │                         ┊                         │        ┊        │
│00394000│ 00 30 e6 39 fc 7f 00 00 ┊ 04 00 04 00 aa aa aa aa │⋄0×9ו⋄⋄┊•⋄•⋄××××│
│00394010│ 0f aa 60 03 60 03 60 03 ┊ 60 03 02 02 02 02 aa aa │•×`•`•`•┊`•••••××│
│00394020│ 00 00 00 00 00 00 00 00 ┊ 00 00 00 00 00 00 00 00 │⋄⋄⋄⋄⋄⋄⋄⋄┊⋄⋄⋄⋄⋄⋄⋄⋄│

This is the raw memory: the zeros, the aa of undefineds, pointers like 00 10 e6 39 fc 7f that try to look different, but are mostly the same.
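
Decoding one of those pointers little-endian, it lands right back inside the same load14 region:

$ python
>>> hex(int.from_bytes(bytes.fromhex('0010e639fc7f0000'), 'little'))
'0x7ffc39e61000'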

Who allocated it? I don’t know. My program is simple. It hardly makes a syscall. It’s pure functional number crunching in a virtual, simulated world, without any side effects.

Could it be a bug in the matrix, er, the kernel? Did it forget all about the invisible main thread’s stack and just mmapped some allocation over it? I will continue this search tomorrow.

And the Raven, never flitting, still is sitting, still is sitting
On the pallid bust of Pallas just above my chamber door;
And his eyes have all the seeming of a demon’s that is dreaming,
And the lamp-light o’er him streaming throws his shadow on the floor;
And my soul from out that shadow that lies floating on the floor

         Shall be lifted—nevermore!


To preserve my sanity, I had to seal the rabbit hole:

I couldn’t bring myself to delete the original core dump file, though. It will be waiting for the next foolish adventurer:

https://github.com/matklad/repros/releases/download/artifacts/core.zst
