Pointers to Temporary Memory

AndrewCodeDev · March 5, 2024, 11:14pm

I feel like it’s one of those “harder than it sounds” problems, but I’m sure there’s some low-hanging fruit in the tree of all possible cases.

Another thing that we haven’t ever recommended is just copying buffers out via direct return. If you only do it a couple times during the life of a program (or maybe even once depending on the circumstance) then it’s not a heavy tax. Especially if it’s a small buffer of bytes. Most of our material is about slices, so I think that’s what people go for first.
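As a rough illustration of copying a buffer out via direct return, here is a minimal sketch. The function name `makeTag` and the 4-byte size are made up for this example and do not come from the doc. The array is returned by value, so no pointer into the callee's stack frame escapes.

```zig
const std = @import("std");

// Hypothetical helper: fills a small local array and returns the whole
// array by value, so the caller ends up with its own copy of the bytes.
fn makeTag(id: u8) [4]u8 {
    var buf: [4]u8 = undefined;
    buf[0] = 'i';
    buf[1] = 'd';
    buf[2] = '-';
    buf[3] = '0' + (id % 10);
    return buf; // the bytes are copied out; nothing points at this frame
}

pub fn main() void {
    const tag = makeTag(7);
    // Slicing here is fine because `tag` lives in main's own frame.
    std.debug.print("{s}\n", .{tag[0..]});
}
```

The cost is one copy of the array per call, which is negligible for a buffer this small.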

pierrelgol · March 6, 2024, 1:59pm

One alternative could be to put that burden on ZLS, as I think it might actually be easier to detect those issues with an LSP than with the actual compiler. On top of that, I don’t know how badly that would slow down compilation. If I remember correctly, the recent shift in the compiler’s internal design is to reduce dependency chains across the whole pipeline to improve compilation speed, which, at least instinctively, doesn’t sound compatible with identifying the “type” of a variable’s address (stack or heap) and catching returns of invalid references to that memory.

chung-leong · March 6, 2024, 2:25pm

I don’t think it’s possible due to the halting problem.

dee0xeed · March 6, 2024, 6:52pm

If it’s not possible (in the general case) at compile time, then it’s also not possible (in the general case) at run time. But it can be possible. Consider our examples.

Example 1 (trivial to catch the footgun)

what is being returned? &x
where is x? It’s on the stack
=> error
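The snippet for Example 1 is not reproduced in this thread. A minimal sketch of the kind of function this analysis describes might look like the following. Both function names are hypothetical, and the code assumes Zig 0.11/0.12-era syntax. The dangling version is deliberately never called; a caller-provided-storage variant is shown next to it as one possible fix.

```zig
const std = @import("std");

// The footgun: `x` lives in this function's stack frame, so the returned
// pointer is dangling as soon as the function returns. Never called here.
fn danglingCounter() *u32 {
    var x: u32 = 41;
    x += 1;
    return &x; // what is returned? &x. where is x? on the stack. => error
}

// One way out: the caller supplies the storage, so the pointer stays
// valid for as long as the caller's variable does.
fn initCounter(out: *u32) *u32 {
    out.* = 42;
    return out;
}

pub fn main() void {
    var storage: u32 = 0;
    const p = initCounter(&storage);
    std.debug.print("{d}\n", .{p.*});
}
```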

gonzo example (more complicated and I will use my C analog here)

what is being returned? b and it is not an address, ok
what is b? It’s a Bar structure with id (not a pointer, skip) and array of Foos
Foo contains a pointer to some other Foo
how are these pointers initialized? they are assigned the address of some element in the same array
where is this array? it’s on the stack
analysis complete => error
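The gonzo example itself is not reproduced in this thread either. As a rough Zig sketch of the shape being analyzed, assume a `Bar` holding an id and a small array of `Foo`, where each `Foo` may point at another `Foo` in the same array. The field names, the array size, and both functions are assumptions for illustration. `makeBar` shows the problematic pattern and is never called; `linkInPlace` links the elements only once the struct sits in its final location.

```zig
const std = @import("std");

const Foo = struct { next: ?*Foo = null };
const Bar = struct { id: u32, foos: [2]Foo };

// Problematic pattern: `b` is returned by value (not an address), but the
// `next` pointer inside it was set to an element of this frame's array.
fn makeBar() Bar {
    var b = Bar{ .id = 1, .foos = [_]Foo{ .{}, .{} } };
    b.foos[0].next = &b.foos[1]; // address inside this function's `b`
    return b; // the copy may still carry a pointer into the dead frame
}

// Possible fix: link the elements after the struct has its final home.
fn linkInPlace(b: *Bar) void {
    b.foos[0].next = &b.foos[1];
}

pub fn main() void {
    var b = Bar{ .id = 1, .foos = [_]Foo{ .{}, .{} } };
    linkInPlace(&b);
    std.debug.print("{}\n", .{b.foos[0].next.? == &b.foos[1]});
}
```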

I do realize it’s very easy to do such investigations “manually” for particular cases, and not that easy to invent a formal general algorithm for detecting stack addresses in what is being returned, but it’s somehow possible, at least for simple cases. It’s better than nothing.

chung-leong · March 7, 2024, 4:29pm

That’s an odd assertion. How would our inability to predict whether an event occurs or not preclude us from observing it when it does?

As in the case of the halting problem, static analysis can’t tell us whether a conditional branch would eventually be taken. Consider the following:

    var string: []u8 = &local_buffer;
    var string_in_stack: bool = true;
    // ...
    if (string_in_stack) {
        string = try allocator.dupe(u8, string);
    }
    return string;

Static analysis would reveal that string can point to both the stack and the heap. Which is it when the return statement is reached? We don’t really know. And in the case of @ptrFromInt(), we don’t know at all what it points to.
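To make the branch depend on something only known at run time, here is a hedged, self-contained variant of the fragment above. The function name `greeting` and the `copy_to_heap` parameter are hypothetical; only the heap path is exercised, because using the result of the other path would read a dangling slice.

```zig
const std = @import("std");

// Hypothetical function: whether the result points at the heap or at this
// frame's `local_buffer` depends on a value only known at run time.
fn greeting(allocator: std.mem.Allocator, copy_to_heap: bool) ![]u8 {
    var local_buffer = [_]u8{ 'h', 'i' };
    var string: []u8 = &local_buffer;
    if (copy_to_heap) {
        string = try allocator.dupe(u8, string);
    }
    return string; // heap slice if copied, dangling stack slice otherwise
}

pub fn main() !void {
    const allocator = std.heap.page_allocator;
    // Only the safe path is exercised here.
    const s = try greeting(allocator, true);
    defer allocator.free(s);
    std.debug.print("{s}\n", .{s});
}
```

A checker looking only at `greeting` sees both possibilities at the `return`, and which one happens is decided by the caller.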

dee0xeed · March 8, 2024, 5:02pm

For your example, as it is, it would reveal that string is always on the stack, since this kinda strange string_in_stack is actually a constant; it’s never mutated. Could you give some “full” example, where a program (at source level) is actually tracking where an object is allocated, on the heap or on the stack?

JPL · March 8, 2024, 5:55pm

“A topic that should be included in a documentation section on the forum; I think there’s a need for a documentation category where only admins can post them.”

Sze · March 8, 2024, 6:39pm

I don’t understand the quotes, are you quoting somebody?
If the content is your own suggestion, that seems like it should be another topic in the Site Feedback category.

dee0xeed · March 8, 2024, 7:34pm

me too, an implied quoted person is definitely not me…

JPL · March 8, 2024, 9:40pm

and dee0xeed

I found the explanation on this theme “Pointers to Temporary Memory” very informative, if there was a DOC section with this kind of theme, that would be nice.

No one is to blame, just a suggestion

Sze · March 8, 2024, 11:15pm

This is the discussion topic for the doc “Pointers to Temporary Memory”.

Do you think something from this discussion is missing / should be added to the opening post?

JPL · March 9, 2024, 1:49am

Indeed, I find this discussion very pragmatic

zhangkaizhao · April 11, 2024, 12:59pm

Off topic: Example 4 in the doc seems a little outdated.

    const std = @import("std");
    const log = std.debug.print;

    const ToyStr = struct {

        const CAP: usize = 9;
        buf: [CAP]u8 = undefined,

        // note that self is passed by value
        fn sliceMeNice(self: ToyStr, from: usize, to: usize) []const u8 {
            log("inside: {s}\n", .{self.buf[from .. to]});
            return self.buf[from .. to];
        }
    };

    const ToyStrP = struct {

        const CAP: usize = 9;
        buf: [CAP]u8 = undefined,

        // note that self is passed by reference
        fn sliceMeNice(self: *ToyStrP, from: usize, to: usize) []const u8 {
            log("inside-p: {s}\n", .{self.buf[from .. to]});
            return self.buf[from .. to];
        }
    };

    const ToyStr2 = struct {

        const CAP: usize = 19;
        buf: [CAP]u8 = undefined,

        fn sliceMeNice(self: ToyStr2, from: usize, to: usize) []const u8 {
            log("inside2: {s}\n", .{self.buf[from .. to]});
            return self.buf[from .. to];
        }
    };

    const ToyStr2P = struct {

        const CAP: usize = 19;
        buf: [CAP]u8 = undefined,

        // note that self is passed by reference
        fn sliceMeNice(self: *ToyStr2P, from: usize, to: usize) []const u8 {
            log("inside2-p: {s}\n", .{self.buf[from .. to]});
            return self.buf[from .. to];
        }
    };

    pub fn main() !void {
        var ts = ToyStr{};
        @memcpy(ts.buf[0..], "aaabbbccc");
        const s = ts.sliceMeNice(0, 6);
        log("outside: {s}\n", .{s});

        var ts_p = ToyStrP{};
        @memcpy(ts_p.buf[0..], "aaabbbccc");
        const s_p = ts_p.sliceMeNice(0, 6);
        log("outside-p: {s}\n", .{s_p});

        var ts2 = ToyStr2{};
        @memcpy(ts2.buf[0..], "aaabbbccc0123456789");
        const s2 = ts2.sliceMeNice(0, 6);
        log("outside2: {s}\n", .{s2});

        var ts2_p = ToyStr2P{};
        @memcpy(ts2_p.buf[0..], "aaabbbccc0123456789");
        const s2_p = ts2_p.sliceMeNice(0, 6);
        log("outside2-p: {s}\n", .{s2_p});
    }

Tested in macOS Monterey 12.7.4 x86_64.

With Zig 0.11.0:

    % zig run example-4.zig
    inside: aaabbb
    outside: aaabbb
    inside-p: aaabbb
    outside-p: aaabbb
    inside2: aaabbb
    outside2: aaabbb
    inside2-p: aaabbb
    outside2-p: aaabbb

With Zig 0.12.0-dev.3610+9d27f34d0 (2024-04-10):

    % zig run example-4.zig
    inside: aaabbb
    outside: aabbb
    inside-p: aaabbb
    outside-p: aaabbb
    inside2: aaabbb
    outside2: 0=*
    inside2-p: aaabbb
    outside2-p: aaabbb