What makes "ban returning pointer to stack memory" difficult?

Yes, but this is required to make the program correct. It retains ptr2, so the memory in the stack frame which ptr2 points to cannot be reused. A compiler must make a program correct, and only then is free to make it as efficient as correctness allows.

There seems to be a hidden premise here which isn’t true, namely, that in the converse situation, where a block is exited without any live pointers to its temporary memory, the compiler is unable to reuse that memory anyway. This simply isn’t the case.

In debug modes, the compiler won’t reuse the memory, so that a debugger can map correctly from a variable name to a memory region. In release modes the compiler is perfectly able to reuse that memory, and will if it’s safe and efficient to do so.

There’s a somewhat weaker, related point, which is that given the possibility of inner-block memory surviving, the compiler loses opportunities to reuse which it would otherwise have, because it can’t do so if it can’t prove that the program is still correct. For instance (I can’t believe we’re still on topic!) passing a block-scoped pointer down the stack might mean that the compiler can’t prove it doesn’t escape, even if it in fact does not.
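A minimal sketch of the shape of code in question, using hypothetical names (`sum`, `total`, `scratch`, `other`): a pointer to block-scoped memory is passed down the stack and never escapes. Whether any particular compiler reuses the slot of `scratch` for `other` is not something this example demonstrates; it only shows where reuse would depend on proving that `sum` did not retain the pointer.

```zig
const std = @import("std");

/// Hypothetical callee: reads through the slice and does not retain it.
fn sum(values: []const u32) u32 {
    var acc: u32 = 0;
    for (values) |v| acc += v;
    return acc;
}

/// Hypothetical caller with two sibling blocks. Reusing the memory of
/// `scratch` for `other` is only valid if no pointer to `scratch`
/// outlives the first block, which requires knowing what `sum` does.
fn total(seed: u32) u32 {
    var result: u32 = 0;
    {
        var scratch = [_]u32{ seed, 2, 3 };
        scratch[0] += 1;
        result += sum(&scratch); // pointer passed down the stack
    }
    {
        var other = [_]u32{ 4, 5, 6 };
        other[0] += seed;
        result += sum(&other);
    }
    return result;
}

test "block-scoped pointers passed down the stack" {
    try std.testing.expectEqual(@as(u32, 23), total(1));
}
```

If `sum` were opaque to the optimizer (for example, an external function), the same source would give it no way to rule out escape.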

That’s true. It’s also true that embedded programming needs to pay very close attention to how much stack it’s using. What this means is that embedded programming needs to be aware of this possibility and organize its use of memory accordingly.

Which has always been true, so overall I don’t see the point of optimizing for something which doesn’t need to come up in the first place.

Honestly I would have preferred a compile error on b. It’s passing a pointer to something known to not be in scope. This is the bane of stable code. Adding a std.debug.log statement would mess with the stack and make your program fail in release. Yuck.

Honestly, after some thought I think the best pattern would be a compiler feature to track defunct pointers, preferably one with a built-in method. Then our allocator’s destroy could mark the pointer as bad and have the compiler complain if someone still used it. Obviously this would not be foolproof, but it could work in normal, simple cases.

Did I read this correctly? You want to make a hidden heap allocation if you’re unsure there is a problem?

Honestly if we go to such lengths, I’d rather introduce a straight check on the return that the pointer does not belong in the current stack… I mean the stack is a known region of memory space, such a check would be trivial to add in safe builds.
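At its core, the check being described is an address-range comparison. The following is only a sketch under stated assumptions: `addressInRegion` is a hypothetical helper, and a local buffer stands in for the stack frame, since getting the real frame bounds at a return site would need compiler support that is not shown here. It assumes Zig 0.11 or later for `@intFromPtr`.

```zig
const std = @import("std");

/// Hypothetical helper: reports whether `addr` falls inside the byte
/// range covered by `region`.
fn addressInRegion(region: []const u8, addr: usize) bool {
    const start = @intFromPtr(region.ptr);
    const end = start + region.len;
    return addr >= start and addr < end;
}

test "pointer into the stand-in frame is flagged, static data is not" {
    // Stand-in for "the current stack frame" (an assumption of this sketch).
    var frame_stand_in: [64]u8 = undefined;
    frame_stand_in[10] = 0;

    const static_text: []const u8 = "hello";

    try std.testing.expect(addressInRegion(&frame_stand_in, @intFromPtr(&frame_stand_in[10])));
    try std.testing.expect(!addressInRegion(&frame_stand_in, @intFromPtr(static_text.ptr)));
}
```

A safe-mode build could in principle run a comparison like this on a returned pointer and panic on a match. Note that it only catches pointers returned directly, not ones stored inside a returned struct.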

The severity of this problem merits a straight up panic. Even if the pointer isn’t used. It’s a disaster waiting to happen when a new developer joins the project.

It’s not possible to catch use-after-free of stack memory at compile-time (#5725) because it can be equated to the halting problem. I’ll leave that proof as an exercise for the reader.

I am feeling very strongly about this line of reasoning. It is either a rhetorical fallacy or a reasoning fallacy.

Were this line of reasoning valid, type systems wouldn’t have existed, because, via Rice’s theorem, you can’t say anything at all about properties of arbitrary programs without stepping into undecidability!

It is a finer claim than that. Every static analysis necessarily has false positives, so it is a question of tradeoffs: is there a type system that admits many valid programs, forbids most erroneous programs, and keeps the annotation burden low?

You can’t make an absolute judgement here, it must be about tradeoffs.

To be clear I’m talking about Zig’s existing type system and language semantics. The language is not going to grow a borrow checker or something like that. If you have some idea to change the type system or language semantics, let’s discuss it, but the null hypothesis is no changes.

Yeah, I agree with all that, I only protest dragging the halting problem in, which I strongly believe is immaterial here.

An inconvenient truth, but a truth nonetheless.

It’s also a salient and applicable truth, since it’s the reason that systems languages invariably leave gaping holes in the type system, such as unsafe and any Zig builtin ending in Cast. Type systems don’t solve this problem, they reduce its surface area, and of course are valuable on that basis.
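As a small illustration of what such a hole looks like, here is a sketch using `@ptrCast`, assuming Zig 0.11 or later (where the builtin takes its target type from the result location). The names are hypothetical. The compiler accepts the programmer’s claim that these four bytes may be viewed as a `u32`; nothing in the type system verifies it beyond the declared alignment.

```zig
const std = @import("std");

test "@ptrCast reinterprets memory on the programmer's say-so" {
    // Alignment is declared explicitly so the cast needs no @alignCast.
    const bytes: [4]u8 align(@alignOf(u32)) = .{ 1, 2, 3, 4 };

    // The type system takes this reinterpretation on trust.
    const as_word: *const u32 = @ptrCast(&bytes);

    // @bitCast performs the same reinterpretation by value, so the two
    // agree regardless of the target's endianness.
    const expected: u32 = @bitCast(bytes);
    try std.testing.expectEqual(expected, as_word.*);
}
```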

As I said initially, the problem of pointer escape is in fact the problem of lifetime analysis. Rust’s lifetime analysis is not sound, and I’m referring to safe Rust here. Perhaps it can be made sound, but that is a claim rather than a fact. In Rust as a whole, which includes the unsafe parts of the language, soundness is of course completely impossible.

The C++ committee is currently tilting at the windmill of lifetime analysis without lifetime types, and that’s going about as well as one might expect.

So (having made the quip first) I consider it both logically and rhetorically sound. Logically, because making the reasonable assumption of “Zig” rather than “not Zig”, it’s simply correct, and rhetorically because it disposes of the question in a somewhat humorous way, to move rapidly to addressing the actual point of the issue, which is catching the problem dynamically in safe modes.

How would you like inline functions to be handled? Should this be allowed?

const std = @import("std");

inline fn fooInline() []u8 {
    var buf: [13:0]u8 = "hello, world!".*;
    return &buf;
}

test "inlining is semantic" {
    const foo = fooInline();
    try std.testing.expectEqualStrings("hello, world!", foo);
}

test "which means it should be equivalent to this block" {
    const foo = fooBlock: {
        // Dereference the literal to copy it into the block-local array.
        var buf: [13:0]u8 = "hello, world!".*;
        // Break with a pointer, mirroring `return &buf` above.
        break :fooBlock &buf;
    };
    try std.testing.expectEqualStrings("hello, world!", foo);
}

That’s how it works at present, as can be seen by the pointers returned from the inline function being distinct, but the non-inline ones being identical:

const std = @import("std");

fn copy(x: *const [5]u8) []const u8 {
    var buf: [5]u8 = x.*;
    return &buf;
}

inline fn copyInline(x: *const [5]u8) []const u8 {
    var buf: [5]u8 = x.*;
    return &buf;
}

pub fn main() void {
    std.debug.print("{*} {*}\n", .{ copy("hello"), copy("world") });
    std.debug.print("{*} {*}\n", .{ copyInline("hello"), copyInline("world") });
}

$ zig run stackmem.zig
u8@7ffd6acda4c8 u8@7ffd6acda4c8
u8@7ffd6acda558 u8@7ffd6acda568

Just watched @ityonemo’s video explaining how clr can detect many “unsafe” operations, like returning stack pointers, at compile-time by statically analyzing Zig’s AIR:

Very cool, thought I’d share here as it relates to the problem statement.
