How to combine runtime and comptime logic in a single function?

timfayz · January 14, 2024, 9:18pm

I have a function that I would like to work in both modes – comptime and runtime. I couldn’t find an example or any recommendations on how to achieve this, whether it is advisable, and how to conduct testing for such functions. Initially, I had separate functions: addSepCT (CT for comptime; idk, maybe there is a better convention) and addSep (for normal runtime). However, I quickly realized I could combine them:

const std = @import("std");

pub fn addSep(alloc: ?std.mem.Allocator, sep: []const u8, args: anytype) ![]const u8 {
    const args_T = @TypeOf(args);
    const args_T_info = @typeInfo(args_T);
    if (args_T_info != .Struct) {
        @compileError("expected tuple or struct, found " ++ @typeName(args_T));
    }

    const args_len = args_T_info.Struct.fields.len;
    const items: [args_len][]const u8 = args;
    if (items.len == 0)
        return "";

    if (@inComptime()) {
        var out: []const u8 = "";
        for (items) |field| {
            if (field.len == 0) continue;
            out = out ++ field ++ sep;
        }
        return out[0..out.len -| sep.len];
    } else {
        if (alloc == null) @panic("alloc cannot be null in runtime mode");
        var out = std.ArrayList(u8).init(alloc.?);
        for (items) |field| {
            if (field.len == 0) continue;
            try out.appendSlice(field);
            try out.appendSlice(sep);
        }
        out.items = out.items[0..out.items.len -| sep.len];
        return try out.toOwnedSlice();
    }
}

test addSep {
    const testFunc = struct {
        const alloc = std.testing.allocator;
        const expectEqual = std.testing.expectEqualSlices;

        pub fn run(expect: []const u8, sep: []const u8, args: anytype) !void {
            if (@inComptime()) {
                const actual = try addSep(null, sep, args);
                try expectEqual(u8, expect, actual);
            } else {
                const actual = try addSep(alloc, sep, args);
                defer alloc.free(actual);
                try expectEqual(u8, expect, actual);
            }
        }
    };

    // comptime
    try comptime testFunc.run("a=b=c", "=", .{ "a", "b", "c" });
    try comptime testFunc.run("a", "=", .{ "a", "" });

    // runtime
    try testFunc.run("a=b=c", "=", .{ "a", "b", "c" });
    try testFunc.run("a", "=", .{ "a", "" });
}

I wonder if I approached it the “Zig way” or if I should have done it at all. What do you think?

IntegratedQuantum · January 14, 2024, 10:03pm

I don’t think this is a case where I would combine functions. The bulk of the functionality (the main loop) differs between the two variants. Additionally, you made the interface more complicated by making the Allocator optional.
In cases like this I would instead try to move the common functionality into a new function:

fn argsStructLen(args: anytype) comptime_int {
    const args_T = @TypeOf(args);
    const args_T_info = @typeInfo(args_T);
    if (args_T_info != .Struct) {
        @compileError("expected tuple or struct, found " ++ @typeName(args_T));
    }
    return args_T_info.Struct.fields.len;
}
pub fn addSep(alloc: std.mem.Allocator, sep: []const u8, args: anytype) ![]const u8 {
    const args_len = argsStructLen(args);
    const items: [args_len][]const u8 = args;
    // This case is already handled by the use of `-|`.
    // if (items.len == 0)
    //     return "";

    var out = std.ArrayList(u8).init(alloc); // `alloc` is not optional here, so no `.?`
    for (items) |field| {
        if (field.len == 0) continue;
        try out.appendSlice(field);
        try out.appendSlice(sep);
    }
    out.items = out.items[0..out.items.len -| sep.len];
    return try out.toOwnedSlice();
}
pub fn addSepCT(sep: []const u8, args: anytype) ![]const u8 {
    if (!@inComptime()) @compileError("Must be called at comptime");
    const args_len = argsStructLen(args);
    const items: [args_len][]const u8 = args;

    var out: []const u8 = "";
    for (items) |field| {
        if (field.len == 0) continue;
        out = out ++ field ++ sep;
    }
    return out[0..out.len -| sep.len];
}

As you can see, there are now only two lines shared between the two functions.
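The comment about `-|` in the code above can be checked in isolation. The following is a minimal sketch, not part of the original reply; `out_len` and `sep_len` are placeholder names chosen for illustration. It shows that saturating subtraction clamps at zero rather than underflowing, which is why slicing an empty result with `out.len -| sep.len` is safe without the early return.

```zig
const std = @import("std");

test "saturating subtraction clamps at zero" {
    // Placeholder values: an empty output and a one-byte separator.
    const out_len: usize = 0;
    const sep_len: usize = 1;

    // A plain `-` would underflow here; `-|` clamps the result to 0.
    try std.testing.expectEqual(@as(usize, 0), out_len -| sep_len);

    // With a non-empty output, `-|` behaves like ordinary subtraction.
    try std.testing.expectEqual(@as(usize, 4), @as(usize, 5) -| sep_len);
}
```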

AndrewCodeDev · January 14, 2024, 10:29pm

For the record, this can definitely be a Zig code smell. Atop what @IntegratedQuantum mentioned, there’s an issue with allocators and comptime to begin with. Memory allocation is different between comptime and runtime, so this sends a mixed message.

Let’s just look at this from a combinatorics perspective and only focus on the allocator and comptime options. Ostensibly from the interface, we have these options:

comptime: true, false
has allocator: true, false

That leaves us with 4 possible combinations:

comptime, non-null allocator // invalid
runtime, non-null allocator // valid
comptime, null allocator // valid
runtime, null allocator // invalid

We can see that there are as many ways to parameterize this incorrectly as there are ways to do it correctly. This, in my opinion, warrants splitting the two apart and giving each function only valid parameter sets. Null values (also in my opinion) do not convey enough information to enforce this behavior, and they add to the cognitive burden of an interface.

As a caveat: sometimes this cannot be so easily handled and parameterization just remains a sticking point. But where we can enforce valid parameter combinations, I think it’s best to try.

AndrewCodeDev · January 14, 2024, 10:50pm

One more addendum here… when I say that @inComptime can be a Zig code smell, I’m not saying that there is never a situation that warrants it. For instance, take @constCast. Many people see casting const away as an absolute code smell, but there are uses for it. Here’s an example from the Allocator.destroy function in the standard library:

pub fn destroy(self: Allocator, ptr: anytype) void {
    const info = @typeInfo(@TypeOf(ptr)).Pointer;
    if (info.size != .One) @compileError("ptr must be a single item pointer");
    const T = info.child;
    if (@sizeOf(T) == 0) return;
    const non_const_ptr = @as([*]u8, @ptrCast(@constCast(ptr)));
    self.rawFree(non_const_ptr[0..@sizeOf(T)], log2a(info.alignment), @returnAddress());
}

We can see the use of it in creating the non_const_ptr variable. So when I say code smell, what I mean here is that we should really consider other options first before reaching for that utility (not that it can’t be used). It’s often an indicator that there’s a problem somewhere else that needs to be addressed.

timfayz · January 15, 2024, 9:06am

Totally agree. Thank you for the revised version. I was just thinking maybe there are some Zig tricks I’m not aware of that would make this combining smooth. It seems there aren’t, at least the obvious ones.

Great perspective. Feels like a science class. I’ll definitely take that away from this.

Yeah, I’ve gathered that this intrinsic is in a sort of “grey zone”. I read the proposal “Add ability to determine if we are currently executing at compile time” (Issue #868, ziglang/zig on GitHub), and some people were even against introducing it.

Overall, thank you guys. It’s getting better now.

timfayz · January 15, 2024, 9:40am

I’m currently a bit confused about the use (or abuse) of comptime – whether to include it everywhere (in the function signature and body) or simply prepend it at the caller site. Consider the same addSepCT in two versions:

fn argsStructLen(comptime args: anytype) comptime_int {
    const args_T = @TypeOf(args);
    const args_T_info = @typeInfo(args_T);
    if (args_T_info != .Struct) {
        @compileError("expected tuple or struct, found " ++ @typeName(args_T));
    }
    return args_T_info.Struct.fields.len;
}

// This function enforces comptime mode on the callee site.
pub fn addSepCT1(comptime sep: []const u8, comptime args: anytype) []const u8 {
    // if (!@inComptime()) @compileError("Must be called at comptime");
    const args_len = argsStructLen(args);
    const items: [args_len][]const u8 = args;

    comptime var out: []const u8 = "";
    inline for (items) |field| {
        if (field.len == 0) continue;
        out = out ++ field ++ sep;
    }

    return out[0..out.len -| sep.len];
}

// This function uses @inComptime to enforce comptime mode on the caller site.
pub fn addSepCT2(sep: []const u8, args: anytype) []const u8 {
    if (!@inComptime()) @compileError("Must be called at comptime");
    const args_len = argsStructLen(args);
    const items: [args_len][]const u8 = args;

    var out: []const u8 = "";
    for (items) |field| {
        if (field.len == 0) continue;
        out = out ++ field ++ sep;
    }

    return out[0..out.len -| sep.len];
}

pub fn main() !void {
    _ = addSepCT1("", .{}); // fine without the `comptime` keyword
    _ = comptime addSepCT2("", .{}); // works only with the `comptime` keyword
}

It’s interesting to note that even though I’m 99% sure that addSepCT1 runs in comptime mode (because everything in it is implicitly forced to be so), the @inComptime() check still trips its @compileError, so I had to comment it out. Why is that? Or, put another way, is it preferable to use comptime explicitly in front of the call, or to design the function itself to enforce it?

IntegratedQuantum · January 15, 2024, 11:08am

Just because all parameters are marked as comptime, that doesn’t mean that all the code inside will be comptime as well. For example, addSepCT1 will still result in a runtime function call to a function that looks roughly like this:

fn addSepCT1__anon_472() []const u8 {
    return "";
}
pub fn main() !void {
    _ = addSepCT1__anon_472(); // the compiler still assumes this is a runtime function, despite all parameters being passed at compile time
    _ = ""; // Here the compiler calculated the result directly
}
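One way to observe this difference directly is to have a function report `@inComptime()` itself. This is a small sketch rather than code from the thread; the function `reportsComptime` and its parameter are hypothetical names used only for illustration. It assumes a Zig version that provides `@inComptime` (0.11 or later).

```zig
const std = @import("std");

// Hypothetical helper: every parameter is comptime, yet the body is still
// analyzed as ordinary runtime code unless the call itself is comptime.
fn reportsComptime(comptime tag: []const u8) bool {
    _ = tag;
    return @inComptime();
}

test "comptime parameters do not make the call comptime" {
    // Plain call: the compiler emits a runtime call to a specialized function.
    try std.testing.expect(!reportsComptime("x"));

    // Call forced with the `comptime` keyword: evaluated at compile time.
    try std.testing.expect(comptime reportsComptime("x"));
}
```

This matches the behavior described in the question: the check inside addSepCT1 sees a runtime context, while the `comptime` keyword at the call site flips it.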

The best way to ensure that a function gets evaluated fully at compile time without the keyword at the call site (is it really that bad though?) would be to inline it (to prevent the compiler from calling it as a separate function) and force the interior into a comptime block as well (so you don’t accidentally make runtime computations in there):

inline fn fullyComptime(comptime params...) ReturnType {
    return comptime blk: {
        ...
        break :blk result; // a comptime block needs a label to yield a value
    };
}
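A concrete instance of this pattern might look like the following. It is only a sketch: the function `joinPair` and its parameters are hypothetical and were not part of the thread. The comptime block is labeled so that it can yield a value through `break`.

```zig
const std = @import("std");

// Hypothetical example: joins two comptime-known strings with a separator.
inline fn joinPair(
    comptime a: []const u8,
    comptime sep: []const u8,
    comptime b: []const u8,
) []const u8 {
    // The labeled comptime block forces the whole computation to happen at
    // compile time, even when the call site omits the `comptime` keyword.
    return comptime blk: {
        break :blk a ++ sep ++ b;
    };
}

test "joinPair needs no comptime keyword at the call site" {
    const joined = joinPair("a", "=", "b");
    try std.testing.expectEqualStrings("a=b", joined);
}
```

The trade-off is that every argument must be comptime-known, so this shape only suits helpers that never need runtime input.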

bnl1 · January 15, 2024, 7:21pm

This is not necessarily true. Sure, allocators like heap_allocator obviously can’t run at compile time, but it is possible to implement a std.mem.Allocator that works exclusively at comptime (and is even usable for some data structures, like ArrayList and JSON parsing, but doesn’t work for HashMap because of pointer casting, as far as I can tell).
edit: of course, for the OP’s case it is invalid because the allocator isn’t even used at comptime.

Cloudef · January 16, 2024, 2:31am

This issue is sadly right now blocking comptime allocators

github.com/ziglang/zig: Making (FixedBuffer)Allocator available at comptime
(opened 02:07AM, 16 Mar 23 UTC by rsepassi; label: standard library)

### Zig Version

0.11.0-dev.1975+e17998b39

### Steps to Reproduce and Observed Behavior

```zig
test "comptime alloc" {
    comptime {
        var buf: [1]u8 = undefined;
        var fba = std.heap.FixedBufferAllocator.init(&buf);
        var allocator = fba.allocator();
        var x = allocator.create(i8) catch unreachable;
        _ = x;
    }
}
```

```
.../std/mem/Allocator.zig:105:65: error: unable to evaluate comptime expression
    const slice = try self.allocAdvancedWithRetAddr(T, null, 1, @returnAddress());
                                                                ^~~~~~~~~~~~~~~~
...:11:29: note: called from here
    var x = allocator.create(i8) catch unreachable;
            ~~~~~~~~~~~~~~~~^~~~
```

### Expected Behavior

Expected FixedBufferAllocator to work at comptime, which was suggested in this old issue https://github.com/ziglang/zig/issues/5873#issuecomment-760620505. Goal is to be able to call code at comptime that expects an Allocator. Seemed sensible to use a FBA, but breaks because of the use of `@returnAddress()`. The impl of FBA does not use the return address passed down from Allocator ([code](https://github.com/ziglang/zig/blob/a2c6ecd6dc0bdbe2396be9b055852324f16d34c9/lib/std/heap.zig#L417)).

I’m writing an FSM compiler where the above would be useful: Cloudef/zig-fsm-compiler on GitHub (Ragel compatible FSM compiler for Zig).

AndrewCodeDev · January 16, 2024, 4:01pm

Correct, I was only referring to the OP’s case.