Are sentinels important for many-item pointers?

I haven’t found a meaningful use for sentinels on many-item pointers.
A sentinel might be meaningful for slices, but when I slice a sentinel-terminated many-item pointer to get a slice or a pointer to an array, the sentinel information is always lost.

const print = @import("std").debug.print;

var n: u8 = 1;

pub fn main() !void {
    const a: [2:99]u8 = .{1, 2};
    const pm = a[n..].ptr;
    print("{}\n", .{ @TypeOf(pm) });     // [*:99]const u8
    print("{} {}\n", .{ pm[0], pm[1] }); // 2 99
    const pm2 = pm[n..];
    print("{}\n", .{ @TypeOf(pm2) });      // [*:99]const u8
    print("{} {}\n", .{ pm2[0], pm2[1] }); // 99 0
    
    const p = pm2[0..1];
    print("{}\n", .{ @TypeOf(p) }); // *const [1]u8
    const s = pm2[0..n];
    print("{}\n", .{ @TypeOf(s) }); // []const u8
}

If you slice to a specific length you won’t get the sentinel unless you explicitly slice using the ptr[start..end :sentinel] syntax, which asserts that there’s a sentinel with the correct value after end.

In practice, virtually every piece of code that consumes sentinel-terminated many-item pointers will compute the length of the string/sequence by iterating over the elements until it reaches the sentinel, and then slice using the computed length:

const std = @import("std");

pub fn main() void {
    const sentinel: u8 = 99;
    const ptr: [*:sentinel]const u8 = &.{ 1, 2, 3 };
    std.debug.print("{}\n", .{@TypeOf(ptr)}); // [*:99]const u8

    var len: usize = 0;
    while (ptr[len] != sentinel) len += 1;
    const slice = ptr[0..len :sentinel];
    std.debug.print("{}\n", .{@TypeOf(slice)}); // [:99]const u8
}

In Zig's standard library, you would conventionally use std.mem.len(), std.mem.span() or std.mem.sliceTo() instead of writing the while loop by hand. In C, you would conventionally use strlen() to compute the length of a null-terminated string, which is what a [*:0]const u8 pointer represents on the Zig side.
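
To make that concrete, here is a small sketch of the three helpers applied to the same pointer. The string "hello" and the variable names are just made up for illustration, and I'm printing types via @typeName so you can check the exact result types on your own Zig version rather than taking my word for it:

```zig
const std = @import("std");

pub fn main() void {
    // A string literal coerces to a 0-terminated many-item pointer.
    const ptr: [*:0]const u8 = "hello";

    // std.mem.len: counts elements up to (not including) the sentinel.
    const len = std.mem.len(ptr);
    std.debug.print("{d}\n", .{len}); // 5

    // std.mem.span: scans for the sentinel and returns a slice
    // that keeps the sentinel in its type.
    const whole = std.mem.span(ptr);
    std.debug.print("{s} {s}\n", .{ whole, @typeName(@TypeOf(whole)) });

    // std.mem.sliceTo: stops at the first occurrence of the given value
    // (or at the sentinel, whichever comes first).
    const prefix = std.mem.sliceTo(ptr, 'l');
    std.debug.print("{s} {s}\n", .{ prefix, @typeName(@TypeOf(prefix)) });
}
```

As far as I know, span() is the one you usually want when you just need "the whole string as a slice", while sliceTo() is handy when the terminator you care about is not the pointer's own sentinel.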

Outside of interop with C, sentinel-terminated many-item pointers can be useful if space is a concern. A [*:0]const u8 value takes up half the size of [:0]const u8, so if you have a lot of string data you might choose to only store the pointers and not the lengths, with the trade-off that lengths will need to be re-computed upon use.
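
If you want to check the size claim yourself, a quick sketch like this should do it. The exact byte counts depend on the target, so I'm only asserting the ratio:

```zig
const std = @import("std");

// A slice is a pointer plus a length, so it is twice as large as the
// bare many-item pointer. Checked at compile time.
comptime {
    std.debug.assert(@sizeOf([:0]const u8) == 2 * @sizeOf([*:0]const u8));
}

pub fn main() void {
    std.debug.print("pointer: {d} bytes, slice: {d} bytes\n", .{
        @sizeOf([*:0]const u8),
        @sizeOf([:0]const u8),
    });
}
```

On a typical 64-bit target I'd expect this to report 8 and 16 bytes, but that part is an assumption about your platform.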

Sentinels are rarely needed when writing pure Zig. They are mainly useful when interacting with C code that expects null-terminated strings.

Aha, thanks. TIL the new syntax.