Safe way to return variable-length slice of string literals depending on argument

I have a scenario where I want a function to return one of several fixed slices of string literals, which have different lengths (so I can’t use an array return type), depending on the argument.

Here is a contrived example:

const std = @import("std");

const Amount = enum { one, two };

fn returnSlice(n: Amount) []const []const u8 {
    return switch (n) {
        .one => &.{"hello"},
        .two => &.{"foo", "bar"},
    };
}

pub fn main() void {
    const slice = returnSlice(.two);
    std.debug.print("{s}\n", .{slice});
}

If .one is passed, it prints { hello }; otherwise it prints { foo, bar }. So it seems to work as expected.

However, I am unsure if this is the correct way to do it, since I am returning a slice, i.e. a form of pointer, to a value that I did not allocate on the heap.

I found one case in the Zig compiler sources which is very similar to mine, in src/targets.zig:

pub fn libcFullLinkFlags(target: std.Target) []const []const u8 {
    // The linking order of these is significant and should match the order other
    // c compilers such as gcc or clang use.
    return switch (target.os.tag) {
        .netbsd, .openbsd => &[_][]const u8{
            "-lm",
            "-lpthread",
            "-lc",
            "-lutil",
        },
        // […]

This gives me some confidence, but here they use &[_][]const u8{ … } instead of &.{ … }. Does that make a difference?

Where are the arrays that I am referencing actually stored in memory? Did my code work by accident, or is there a footgun waiting to happen (Pointers to Temporary Memory)?

String literals are pointers to statically allocated null-terminated arrays placed alongside other statically initialized data in the data segment.

Not sure about pointers to arrays of string literals, but my guess would be that the compiler is smart enough to place those string literals back to back, letting you return a static slice of them from a function just like a single string literal.

String literals are stored in the .data section. Pointing to them is fine, they never go out of scope.
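The point about string literals can be checked directly. Below is a minimal sketch (the variable names are made up for illustration) that asserts the type of a literal at compile time and shows it coercing to the []const u8 slices used in this thread:

```zig
const std = @import("std");

pub fn main() void {
    const literal = "hello";

    // A string literal is a constant pointer to a null-terminated byte array.
    comptime std.debug.assert(@TypeOf(literal) == *const [5:0]u8);

    // It coerces to a plain slice, which is the element type of the arrays in this thread.
    const as_slice: []const u8 = literal;

    // The sentinel byte sits right after the last character.
    std.debug.assert(literal[literal.len] == 0);

    std.debug.print("{s} has type {s}\n", .{ as_slice, @typeName(@TypeOf(literal)) });
}
```

Because the pointer refers to static data, copying it into an array or returning it from a function does not tie it to any stack frame.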

Yes, I knew about the string literals. I meant the arrays containing pointers to them.

Let me rephrase the code to make it clearer.

I suppose this code should be equivalent:

const s1 = "hello";
const s2 = "foo";
const s3 = "bar";

fn returnSlice(n: Amount) []const []const u8 {
    const a1 = [_][]const u8{s1};
    const a2 = [_][]const u8{s2, s3};

    return switch (n) {
        .one => &a1,
        .two => &a2,
    };
}

Here I am (seemingly) returning pointers to local variables a1 and a2. How can I know if this is safe?

I can further change the code to:

const s1 = "hello";
const s2 = "foo";
const s3 = "bar";

const a1 = [_][]const u8{s1};
const a2 = [_][]const u8{s2, s3};

fn returnSlice(n: Amount) []const []const u8 {
    return switch (n) {
        .one => &a1,
        .two => &a2,
    };
}

Here it is clear that I can safely return &a1 and &a2. Can I rely on the last two snippets being equivalent?

The pointer array is going to be fine too. The compiler will create relocation entries that update the pointers’ values to reflect the actual address of the data segment when the program starts.

Like @chung-leong said, it’s fine in this case because everything is comptime-known, and will therefore live in the .data section. This would break if something depended on runtime values. If you want more guarantees, I suggest using the comptime keyword. This will guarantee that the data lives in the .data section rather than on the stack.

const s1 = "hello";
const s2 = "foo";
const s3 = "bar";

fn returnSlice(n: Amount) []const []const u8 {
    const a1 = comptime [_][]const u8{s1};
    const a2 = comptime [_][]const u8{s2, s3};

    return switch (n) {
        .one => &a1,
        .two => &a2,
    };
}
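As a rough, self-contained sketch of the same idea applied to the original example: here the comptime keyword is placed directly on each switch prong, which is my own variation and not something confirmed elsewhere in this thread. The main function then calls returnSlice more than once before reading the results, to show the slices do not depend on a live stack frame:

```zig
const std = @import("std");

const Amount = enum { one, two };

fn returnSlice(n: Amount) []const []const u8 {
    return switch (n) {
        // `comptime` forces each array to be evaluated at compile time,
        // so the pointer refers to static data, not to this stack frame.
        .one => comptime &[_][]const u8{"hello"},
        .two => comptime &[_][]const u8{ "foo", "bar" },
    };
}

pub fn main() void {
    // Both calls have returned before either result is read.
    const one = returnSlice(.one);
    const two = returnSlice(.two);

    for (one) |s| std.debug.print("{s}\n", .{s});
    for (two) |s| std.debug.print("{s}\n", .{s});
}
```

If one of the array elements came from a runtime value, the comptime expression would fail to compile instead of silently producing a pointer to stack memory, which is the extra guarantee mentioned above.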