I recall reading somewhere from the zig core team that making returning pointers to stack memory (“dangling pointers”) a compile error was “difficult from a language specification perspective.”
Could someone elaborate more?
Are there legitimate use-cases for returning pointers to stack memory that would be difficult to carve-out space for in the language specification? Could this not be solved by requiring @intFromPtr to return them?
I think @kj4tmp is asking about how difficult would it be to additionally analyze bodies of functions that return pointers, like fn get() *i32, such that analyzing functions that return pointers to stack memory would through a compile error.
For instance, analyzing either of these functions would be a compile error:
Yeah, I would ban that case as well. But the Sema’s logic might be that since we’re returning a pointer to a pointer to i32, and the returned i32 value isn’t on stack, then we can make it a valid case, as it is now:
is also a valid case, with the same logic applied. But object (pointer in this case) should be placed somewhere? Where this pointer is stored? Your example crashes on ReleaseSafe. I think it reads leftover value from stack. I don’t think sema really knows if pointer points to heap allocated object.
Edited your example slightly for easier debugging:
It indeed returns pointer to stack memory:
lea rcx, [rbp - 32] ; get address of `x_ptr` and write it to `rcx` (`ptr`)
mov qword ptr [rbx], rcx ; put resulting address of stack allocated `ptr` to `rbx`
Both of these functions are doing perfectly legal things, on their own. Passing a pointer down-stack is perfectly ok, taking a pointer and returning a struct incorporating it is also a-ok. Returning a value directly from a called function? Absolutely normal behavior.
This being the simplest possible example, is already far more difficult to catch than it may initially look like:
It’s seems to my untrained eye that if the problems solved by Proposal: restricted function types · Issue #23367 · ziglang/zig · GitHub are actually solved (the compiler knowing the full function call graph), then the compiler could also know the lifetimes of all stack allocated memory, and thus detect pointers to invalid stack allocated memory.
Consider the example of an acyclic function call graph. The compiler would only need to check that a given stack variable pointer does not traverse upwards from its point of creation? I’m guessing it looks a bit like reference counting.
If I understand you correctly, you are saying the reason that use-after-stack-lifetime-expired can not be a compile error is similar to the reason that std.debug.assert(false) cannot be a compile error?
In other words, the (unwritten) language specification currently catagorizes use-after-stack-lifetime-expired as undefined behavior?
This situation could be caught by an eager analysis, similar to how “unused function parameter” is currently analyzed? The compiler need only prove that it is possible based on the AST? Maybe that would be annoying due to false positives.
Seems like there’s a difference between using &x when x is on the stack and when x is not on the stack. I don’t normally suggest language features, but how awful would it be if we had to write it as @someNewTypeOfCast(&x) before using it as a pointer, even for calling some x.func()?
I know it’s functionally just as easy to make a mistake, but with this I could audit them, and maybe along the way I put a few more things into an arena and off the stack. Even a FixedBufferAllocator-backed arena that uses stack memory reduces the number of places to look for this error.
I’m with @kj4tmp , it’s not as hard as people make it out. Gcc and clang’s static analyzer catch these very easily.
Catching every possible situation is unsolvable, as has been pointed out, like due to external functions. But most bugs of this type are not complicated. We could catch the vast majority of them with interprocedural analysis, though it could be costly. But I think even a naive approach, like just looking at the AST, would already catch a whole lot of bugs, while being cheap.
There are no false positives with the AST approach (though it would obviously have a lot of false negatives). And we could just design the interprocedural analysis to not have false positives, that is, do not complain unless you can prove with absolute certainty there is a dangling pointer (once again, such a high bar would mean a lot of false negatives).