What makes "ban returning pointer to stack memory" difficult?

mnemnion · April 10, 2025, 4:27pm

Yes, but this is required to make the program correct. It retains ptr2, so the memory in the stack frame which ptr2 points to cannot be reused. A compiler must make a program correct, and only then is free to make it as efficient as correctness allows.

There seems to be a hidden premise here which isn’t true, namely, that in the converse situation, where a block is exited without any live pointers to its temporary memory, the compiler is unable to reuse that memory anyway. This simply isn’t the case.

In debug modes, the compiler won’t reuse the memory, so that a debugger can map correctly from a variable name to a memory region. In release modes the compiler is perfectly able to reuse that memory, and will if it’s safe and efficient to do so.

There’s a somewhat weaker, related point, which is that given the possibility of inner-block memory surviving, the compiler loses opportunities to reuse which it would otherwise have, because it can’t do so if it can’t prove that the program is still correct. For instance (I can’t believe we’re still on topic!) passing a block-scoped pointer down the stack might mean that the compiler can’t prove it doesn’t escape, even if it in fact does not.

That’s true. It’s also true that embedded programming needs to pay very close attention to how much stack it’s using. What this means is that embedded programming needs to be aware of this possibility and organize its use of memory accordingly.

Which has always been true, so overall I don’t see the point of optimizing for something which doesn’t need to come up in the first place.

Joen-UnLogick · April 10, 2025, 5:36pm

Honestly I would have preferred a compile error on b. It’s passing a pointer to something known to be out of scope. This is the bane of stable code. Adding a std.debug.log statement would mess with the stack and make your program fail in release. Yuck.

Joen-UnLogick · April 10, 2025, 5:42pm

Honestly, after some thought, I think the best pattern would be a compiler feature to track defunct pointers, preferably one with a built-in method, so that our allocator’s destroy could mark the pointer as bad and have the compiler complain if someone still used it. Obviously this would not be foolproof, but it could work in normal, simple cases.
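As a rough runtime analogue of this idea (not the compiler feature itself), the sketch below wraps an allocation in a handle that records when it has been destroyed. `Tracked`, `create`, `destroy`, `get`, and `error.UseAfterDestroy` are hypothetical names invented for illustration; nothing like this exists in std. Copies of the raw pointer taken before `destroy` are not tracked, which matches the "not foolproof" caveat above.

```zig
const std = @import("std");

/// Hypothetical wrapper (not a std API): an owning handle that remembers
/// whether `destroy` has already been called on it.
fn Tracked(comptime T: type) type {
    return struct {
        ptr: ?*T,

        const Self = @This();

        fn create(allocator: std.mem.Allocator, value: T) !Self {
            const p = try allocator.create(T);
            p.* = value;
            return Self{ .ptr = p };
        }

        /// Frees the value (if still live) and marks the handle as defunct.
        fn destroy(self: *Self, allocator: std.mem.Allocator) void {
            if (self.ptr) |p| allocator.destroy(p);
            self.ptr = null;
        }

        /// Returns the pointer, or an error if the handle is defunct.
        fn get(self: Self) error{UseAfterDestroy}!*T {
            const p = self.ptr orelse return error.UseAfterDestroy;
            return p;
        }
    };
}

test "using a destroyed handle is reported instead of dereferenced" {
    var h = try Tracked(u32).create(std.testing.allocator, 42);
    try std.testing.expectEqual(@as(u32, 42), (try h.get()).*);
    h.destroy(std.testing.allocator);
    try std.testing.expectError(error.UseAfterDestroy, h.get());
}
```

The check happens at run time and only through the handle, so it is closer to a safe-mode assertion than to the compile-time diagnostic being asked for.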

andrewrk · April 10, 2025, 8:24pm

github.com/ziglang/zig

introduce runtime safety for dangling stack pointers

opened 08:23PM - 10 Apr 25 UTC by andrewrk

Labels: enhancement, proposal, accepted

I didn't see anywhere this was clearly written up so here it is.

It's not possible to catch use-after-free of stack memory at compile-time (#5725) because it can be equated to the halting problem. I'll leave that proof as an exercise for the reader.

So, we catch it at runtime, in safe build modes.

Step 1, do escape analysis. Only stack locals which have pointers captured which might outlive their scope are subject to these safety checks.

Step 2, introduce an API for heap-allocating memory, like this:

```zig
extern fn safe_alloc(size: usize, alignment: u8) ?[*]u8;
extern fn safe_free(ptr: [*]u8, size: usize, alignment: u8, rbp: usize) void;
```

These would default to using a slimmed down version of `std.heap.DebugAllocator` - one that avoids reuse of memory addresses, but does not capture stack traces. Perhaps this would be implemented in compiler_rt so that it could be optimized and be compiled *without* these safety checks, which would otherwise be recursive.

The subset of stack values which have possibly escaped pointers would then be allocated this way. As an optimization, if there were multiple escaped values in the stack frame, they could be allocated together and freed together.

The allocation function could fail, so the stack slots would still be reserved for such case. Stack base address is passed to `safe_free` so that it can ignore such fallback pointers.

Heap-allocating instead of stack-allocating is obviously significantly slower, so that's why it's important for Step 1 to work well, in order to make Step 2 rare.

So then, when a dangling stack pointer is used, it either segfaults, or its bytes have been memset to the `0xaa` pattern, making it very likely to immediately trigger a crash. Even if it does not trigger a crash, however, it is still memory safe in the sense that the memory does not alias any other allocations, making it easier to debug in Debug mode, and avoiding a certain class of bugs in ReleaseSafe mode.
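To make the effect of Step 2 concrete, here is a simplified model written by hand. It is only a sketch under stated assumptions: `safeAlloc` and `safeFree` are hypothetical stand-ins with simplified signatures (no alignment, no `rbp`), and a static bump region replaces the allocator the proposal describes. The region never hands out the same address twice, and "freeing" only poisons the bytes with `0xaa`, so a stale pointer reads the poison pattern rather than some other live variable.

```zig
const std = @import("std");

// Hypothetical backing store: a bump region whose addresses are never reused,
// so a stale pointer cannot alias a newer allocation.
var region: [256]u8 = undefined;
var next: usize = 0;

/// Simplified stand-in for the proposal's `safe_alloc` (alignment ignored).
fn safeAlloc(size: usize) ?[]u8 {
    if (next + size > region.len) return null;
    const slice = region[next..][0..size];
    next += size;
    return slice;
}

/// Simplified stand-in for `safe_free`: poison the bytes, never recycle them.
fn safeFree(slice: []u8) void {
    @memset(slice, 0xaa);
}

/// What a function with an escaping local might be lowered to in a safe build.
fn makeDangling() []const u8 {
    const buf = safeAlloc(5) orelse @panic("region exhausted");
    defer safeFree(buf); // runs when the "local" goes out of scope
    @memcpy(buf, "hello");
    return buf; // the pointer escapes; its target is poisoned on return
}

test "a stale pointer observes the 0xaa pattern" {
    const stale = makeDangling();
    for (stale) |b| try std.testing.expectEqual(@as(u8, 0xaa), b);
}
```

Reading `stale` is well defined here only because the model's region is a global that is never released; in the proposal the memory may instead be unmapped, which is where the segfault case comes from.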

Joen-UnLogick · April 10, 2025, 9:30pm

Did I read this correctly? You want to make a hidden heap allocation if you’re unsure there is a problem?

Honestly, if we go to such lengths, I’d rather introduce a straight check on the return that the pointer does not belong to the current stack frame… I mean, the stack is a known region of memory space, so such a check would be trivial to add in safe builds.

The severity of this problem merits a straight up panic. Even if the pointer isn’t used. It’s a disaster waiting to happen when a new developer joins the project.
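A minimal sketch of the kind of check described in this post, assuming the bounds of the frame being popped are available as two addresses. `inRange`, `checkReturnedPointer`, and `error.DanglingStackPointer` are hypothetical names, and a plain array stands in for the callee's frame so the example does not depend on any platform's stack layout.

```zig
const std = @import("std");

/// Hypothetical helper: true if `addr` lies in the half-open range [low, high).
fn inRange(addr: usize, low: usize, high: usize) bool {
    return addr >= low and addr < high;
}

/// Sketch of a check a safe build could run just before `return ptr;`.
/// `frame_low` and `frame_high` stand in for the bounds of the dying frame.
fn checkReturnedPointer(
    ptr: *const anyopaque,
    frame_low: usize,
    frame_high: usize,
) error{DanglingStackPointer}!void {
    if (inRange(@intFromPtr(ptr), frame_low, frame_high)) {
        return error.DanglingStackPointer;
    }
}

test "a pointer into the dying frame is rejected" {
    // A plain array stands in for the callee's stack frame.
    var frame: [64]u8 = undefined;
    frame[10] = 1;
    const low = @intFromPtr(&frame);
    const high = low + frame.len;

    // A separate object that lives outside the simulated frame.
    var outside: u8 = 0;
    outside += 1;

    try std.testing.expectError(
        error.DanglingStackPointer,
        checkReturnedPointer(&frame[10], low, high),
    );
    try checkReturnedPointer(&outside, low, high);
}
```

A check of this shape only sees the pointer that is returned directly. A stack address stored inside a returned struct, written through an out-parameter, or leaked from a block rather than a function would pass unnoticed.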

matklad · April 10, 2025, 9:47pm

It’s not possible to catch use-after-free of stack memory at compile-time (#5725) because it can be equated to the halting problem. I’ll leave that proof as an exercise for the reader.

I am feeling very strongly about this line of reasoning. It is either a rhetorical fallacy or a reasoning fallacy.

Were this line of reasoning valid, type systems wouldn’t exist, because, via Rice’s theorem, you can’t say anything at all about properties of arbitrary programs without stepping into undecidability!

It is a finer claim than that. Every static analysis necessarily has false positives, so it is a question of tradeoffs: is there a type system that admits many valid programs, forbids most erroneous programs, and keeps the annotation burden low?

You can’t make an absolute judgement here, it must be about tradeoffs.

andrewrk · April 10, 2025, 11:19pm

To be clear I’m talking about Zig’s existing type system and language semantics. The language is not going to grow a borrow checker or something like that. If you have some idea to change the type system or language semantics, let’s discuss it, but the null hypothesis is no changes.

matklad · April 10, 2025, 11:41pm

Yeah, I agree with all that, I only protest dragging halting problem in, which I strongly believe is immaterial here.

mnemnion · April 11, 2025, 5:04pm

An inconvenient truth, but a truth nonetheless.

It’s also a salient and applicable truth, since it’s the reason that systems languages invariably leave gaping holes in the type system, such as unsafe and any Zig builtin ending in Cast. Type systems don’t solve this problem, they reduce its surface area, and of course are valuable on that basis.

As I said initially, the problem of pointer escape is in fact the problem of lifetime analysis. Rust’s lifetime analysis is not sound, and I’m referring to safe Rust here. Perhaps it can be made sound, but that is a claim rather than a fact. In Rust as a whole, which includes the unsafe parts of the language, it is of course completely impossible.

The C++ committee is currently tilting at the windmill of lifetime analysis without lifetime types, and that’s going about as well as one might expect.

So (having made the quip first) I consider it both logically and rhetorically sound. Logically, because, making the reasonable assumption of “Zig” rather than “not Zig”, it’s simply correct; and rhetorically, because it disposes of the question in a somewhat humorous way, to move rapidly to addressing the actual point of the issue, which is catching the problem dynamically in safe modes.

joed · April 14, 2025, 7:12am

How would you like inline functions to be handled? Should this be allowed?

const std = @import("std");

inline fn fooInline() []u8 {
    var buf: [13:0]u8 = "hello, world!".*;
    return &buf;
}

test "inlining is semantic" {
    const foo = fooInline();
    try std.testing.expectEqualStrings("hello, world!", foo);
}

test "which means it should be equivalent to this block" {
    const foo: []u8 = fooBlock: {
        // `.*` copies the string literal into the block-local array.
        var buf: [13:0]u8 = "hello, world!".*;
        // Break with a pointer to `buf`, mirroring `return &buf` above.
        break :fooBlock &buf;
    };
    try std.testing.expectEqualStrings("hello, world!", foo);
}

That’s how it works at present, as can be seen by the pointers returned from the inline function being distinct, but the non-inline ones being identical:

const std = @import("std");

fn copy(x: *const [5]u8) []const u8 {
    var buf: [5]u8 = x.*;
    return &buf;
}

inline fn copyInline(x: *const [5]u8) []const u8 {
    var buf: [5]u8 = x.*;
    return &buf;
}

pub fn main() void {
    std.debug.print("{*} {*}\n", .{ copy("hello"), copy("world") });
    std.debug.print("{*} {*}\n", .{ copyInline("hello"), copyInline("world") });
}

$ zig run stackmem.zig 
u8@7ffd6acda4c8 u8@7ffd6acda4c8
u8@7ffd6acda558 u8@7ffd6acda568

tensorush · April 14, 2025, 12:19pm

Just watched @ityonemo’s video explaining how clr can detect many “unsafe” operations, like returning stack pointers, at compile-time by statically analyzing Zig’s AIR.

Very cool, thought I’d share here as it relates to the problem statement.