Exploiting inline for to avoid stack corruption

From the source of std.meta.Tuple we see the following:

pub fn Tuple(comptime types: []const type) type {
    return CreateUniqueTuple(types.len, types[0..types.len].*);
}

fn CreateUniqueTuple(comptime N: comptime_int, comptime types: [N]type) type {
    var tuple_fields: [types.len]std.builtin.Type.StructField = undefined;
    inline for (types, 0..) |T, i| {
        @setEvalBranchQuota(10_000);
        var num_buf: [128]u8 = undefined;
        tuple_fields[i] = .{
            .name = std.fmt.bufPrintZ(&num_buf, "{d}", .{i}) catch unreachable,
            .type = T,
            .default_value_ptr = null,
            .is_comptime = false,
            .alignment = 0,
        };
    }

    return @Type(.{
        .@"struct" = .{
            .is_tuple = true,
            .layout = .auto,
            .decls = &.{},
            .fields = &tuple_fields,
        },
    });
}

We see that a reference to num_buf is being re-used to generate field names. In a normal for loop, the next iteration would overwrite this stack memory, but it appears that making the loop inline is what avoids the problem here.

Is this officially approved behavior? It feels sketchy to me, because I typically associate inline with optimization.

inline isn’t inherently an optimisation. It can be one, but it can also be the opposite.
I’m not sure whether this would be considered approved usage of inline, but I doubt the behaviour will be removed, as that would make the semantics of inline more complicated and less intuitive, and would probably make the implementation harder.
At least in my opinion.

Note that the function in question returns a type, so the entire function is executed in a comptime context. At comptime the rules for memory lifetimes are a bit more relaxed.
e.g. the following code is fine:

const result: []const u8 = comptime blk: {
    // This array does not live on the normal stack.
    var arr: [128]u8 = undefined;
    ...
    break :blk arr[0..]; // So we can use it outside the current scope.
};

The compiler keeps track of which pieces of comptime memory are reused in a runtime context, and then it will put them into the read-only section of the executable.
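For what it's worth, here is a minimal sketch of that pattern that compiles as a complete file. The name letters and the buffer contents are made up for illustration. It copies the scratch buffer into a const before taking its address, on the assumption that newer compiler versions are stricter about pointers to mutable comptime variables escaping into runtime code.

```zig
const std = @import("std");

// Hypothetical example: build "abcde" in a comptime scratch buffer.
const letters: []const u8 = comptime blk: {
    var buf: [5]u8 = undefined;
    for (&buf, 0..) |*c, i| {
        c.* = 'a' + @as(u8, @intCast(i));
    }
    // Copy into a const so that no pointer to the mutable buffer escapes.
    const final = buf;
    break :blk &final;
};

pub fn main() void {
    // The slice is readable at runtime even though buf's scope has ended.
    std.debug.print("{s}\n", .{letters});
}
```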

This is a really useful concept, as it allows you to initialize data structures at compile-time. A good example for this is the StaticStringMap.initComptime().
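As a rough sketch of what that looks like in practice (assuming a Zig version where the type is called std.StaticStringMap; the Color enum and parseColor helper are hypothetical names for illustration only):

```zig
const std = @import("std");

// Hypothetical enum, used only for this example.
const Color = enum { red, green, blue };

// The lookup table is built entirely at compile time.
const color_map = std.StaticStringMap(Color).initComptime(.{
    .{ "red", .red },
    .{ "green", .green },
    .{ "blue", .blue },
});

// Runtime lookup against the comptime-initialized table.
fn parseColor(name: []const u8) ?Color {
    return color_map.get(name);
}

pub fn main() void {
    if (parseColor("green")) |c| {
        std.debug.print("{s}\n", .{@tagName(c)});
    }
}
```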

The inline qualifier on the for loop is redundant in that context and can be omitted: the return type, type, is a comptime-only type, which already forces the function to be evaluated at compile time, and comptime variables have static lifetimes (like @IntegratedQuantum said).

inline for (as well as inline while and inline else for switches) can be used as a loop-unrolling optimization, but it’s primarily meant for situations where you need a loop over comptime-only values to have runtime side effects.
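A small sketch of that second use case, assuming a made-up Point struct and sumFields helper: the field list is only known at compile time, but each unrolled iteration performs a runtime addition.

```zig
const std = @import("std");

// Hypothetical struct, used only for this example.
const Point = struct { x: i32, y: i32 };

fn sumFields(p: Point) i32 {
    var total: i32 = 0;
    // std.meta.fields yields comptime-only field descriptors, so the loop
    // must be inline; the body is emitted once per field.
    inline for (std.meta.fields(Point)) |field| {
        total += @field(p, field.name);
    }
    return total;
}

pub fn main() void {
    std.debug.print("{d}\n", .{sumFields(.{ .x = 3, .y = 4 })});
}
```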

However, if we take a runtime loop like this

pub fn main() void {
    var x: *const usize = &0;
    for (0..10) |i| {
        var y: usize = i;
        if (y == 5) {
            x = &y;
        }
    }
    @import("std").debug.print("{d}\n", .{x.*});
}

we get different results depending on whether we use for (prints 9) or inline for (prints 5).

It is not super clear from a language specification standpoint whether this is intentional and legal, or if it’s just a compiler implementation detail. In fact, the rules for lifetimes of block-scoped variables are not very clear either, and I don’t know whether something like this would be considered legal or UB:

pub fn main() void {
    var x: *const i32 = &0;
    {
        var y: i32 = 1;
        x = &y;
    }
    @import("std").debug.print("{d}\n", .{x.*});
}

I would personally avoid letting references to local variables leak out of the variable’s enclosing scope (unless we’re in comptime as previously discussed).
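One way to sidestep the question in the runtime loop example is to copy the value out instead of keeping a pointer to the loop-local variable. This is only a sketch of that idea; findValue is a hypothetical helper, not something from the standard library.

```zig
const std = @import("std");

// Returns the matching value itself, so nothing points into the loop body.
fn findValue(limit: usize, wanted: usize) ?usize {
    var found: ?usize = null;
    for (0..limit) |i| {
        const y: usize = i;
        if (y == wanted) {
            found = y; // copies the value, not the address of y
        }
    }
    return found;
}

pub fn main() void {
    if (findValue(10, 5)) |v| {
        std.debug.print("{d}\n", .{v}); // prints 5
    }
}
```

With this shape the result no longer depends on whether the loop is a plain for or an inline for.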

The compiler appears to deliberately allocate enough stack space for all variables from all nested scopes. Code like this:

const std = @import("std");

pub fn main() void {
    var x: *const i32 = &0;
    {
        var y: i32 = 1;
        x = &y;
        std.debug.print("{p}\n", .{&y});
    }
    {
        var y: i32 = 2;
        std.debug.print("{p}\n", .{&y});
    }
    std.debug.print("{d}\n", .{x.*});
}

prints output like:

i32@7ffc2e06ba9c
i32@7ffc2e06baac
1

and the layout of the two y variables doesn’t change if we remove x completely; they are always spaced by 0x10 bytes.

Still, this is definitely something that should be specified explicitly. Dereferencing a pointer to a variable that has gone out of scope certainly smells like UB, so if there are situations where this is permitted (whether always within a single stack frame, or through some C++/Rust-like mechanism for prolonging the lifetime of temporaries), then we need a clarification as to exactly when that is the case.
