I’m trying to implement an addSep function that, given addSep("=", .{ "A", "B", "C" }), produces "A=B=C" at comptime. Here is the sketch:
const std = @import("std");
pub fn addSep(comptime sep: []const u8, args: anytype) []const u8 {
    _ = sep;
    const args_T = @TypeOf(args);
    const args_T_info = @typeInfo(args_T);
    if (args_T_info != .Struct) {
        @compileError("expected tuple or struct, found " ++ @typeName(args_T));
    }
    const args_len = args_T_info.Struct.fields.len;
    const fields = @as([args_len][]const u8, args);
    // const fields: [args_len][]const u8 = args; // same?
    comptime var res = "";
    var i: usize = 0;
    while (i < args_len) : (i += 1) { // same error using `for (fields) |field| ...`
        // here, I wished to get something like: res = res ++ fields[i];
        std.log.debug("{any}", .{fields[i]});
    }
    return res;
}
pub fn main() !void {
    std.log.debug("{s}", .{comptime addSep("=", .{ "A", "B" })});
}
Running the above gives:
/Users/timfayz/.zig/lib/std/Thread.zig:600:45: error: comptime call of extern function
assert(c.pthread_threadid_np(null, &thread_id) == 0);
~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~
/Users/timfayz/.zig/lib/std/Thread.zig:281:29: note: called from here
return Impl.getCurrentId();
~~~~~~~~~~~~~~~~~^~
/Users/timfayz/.zig/lib/std/Thread/Mutex.zig:82:47: note: called from here
const current_id = Thread.getCurrentId();
~~~~~~~~~~~~~~~~~~~^~
/Users/timfayz/.zig/lib/std/Thread/Mutex.zig:46:19: note: called from here
self.impl.lock();
~~~~~~~~~~~~~~^~
/Users/timfayz/.zig/lib/std/log.zig:152:36: note: called from here
std.debug.getStderrMutex().lock();
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~
/Users/timfayz/.zig/lib/std/log.zig:125:22: note: called from here
std.options.logFn(message_level, scope, format, args);
~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/Users/timfayz/.zig/lib/std/log.zig:197:16: note: called from here
log(.debug, scope, format, args);
~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.notes/q_comptimeConcat.zig:16:22: note: called from here
std.log.debug("{any}", .{fields[i]});
~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~
.notes/q_comptimeConcat.zig:25:43: note: called from here
std.log.debug("{s}", .{comptime addSep("=", .{ "A", "B" })});
~~~~~~^~~~~~~~~~~~~~~~~~~~
referenced by:
callMain: /Users/timfayz/.zig/lib/std/start.zig:585:32
initEventLoopAndCallMain: /Users/timfayz/.zig/lib/std/start.zig:519:34
remaining reference traces hidden; use '-freference-trace' to see all reference traces
Not sure what this error is about and how to proceed.
EDIT: Others beat me to it, hopefully there’s still something useful below.
You’re trying to print to stderr during comptime, which isn’t allowed.
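As an aside on the commented-out line in the question: something like `res = res ++ fields[i]` can work once the logging call is gone, provided `res` is a comptime var typed as []const u8 and the loop is an inline for. A minimal sketch, assuming every element of args is a string; the name addSepConcat is made up for illustration and this has not been checked against every Zig release:

```zig
const std = @import("std");

// Hypothetical variant that builds the result with `++` instead of a buffer.
pub fn addSepConcat(comptime sep: []const u8, comptime args: anytype) []const u8 {
    // Typed as a slice so that its type does not change as it grows.
    comptime var res: []const u8 = "";
    inline for (args, 0..) |arg, i| {
        // Put the separator before every element except the first.
        res = res ++ (if (i == 0) "" else sep) ++ arg;
    }
    return res;
}

test "addSepConcat joins with ++" {
    const joined = comptime addSepConcat("=", .{ "A", "B", "C" });
    try std.testing.expectEqualStrings("A=B=C", joined);
}
```

Each `++` creates a new comptime array, so this is simpler to write but does more comptime work than sizing one buffer up front.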
In general, stuff like this is usually done by:
1. Counting the amount of space you will need
2. Creating an array with the appropriate amount of space
3. Populating the array
So in your case you’d want to do something like:
const std = @import("std");
pub fn addSep(comptime sep: []const u8, comptime args: anytype) []const u8 {
    const args_T = @TypeOf(args);
    const args_T_info = @typeInfo(args_T);
    if (args_T_info != .Struct) {
        @compileError("expected tuple or struct, found " ++ @typeName(args_T));
    }
    if (args.len == 0) return &[_]u8{};
    var bytes_needed = sep.len * (args.len - 1);
    for (args) |arg| {
        bytes_needed += arg.len;
    }
    var result: [bytes_needed]u8 = undefined;
    @memcpy(result[0..args[0].len], args[0]);
    var byte_i = args[0].len;
    var arg_i: usize = 1;
    while (arg_i < args.len) : (arg_i += 1) {
        const arg = args[arg_i];
        @memcpy(result[byte_i..][0..sep.len], sep);
        byte_i += sep.len;
        @memcpy(result[byte_i..][0..arg.len], arg);
        byte_i += arg.len;
    }
    return &result;
}
test {
    const joined = comptime addSep("=", .{ "A", "B", "C" });
    try std.testing.expectEqualStrings("A=B=C", joined);
}
Note that the return type isn’t the same type as a regular string literal: @compileLog(@TypeOf("A=B=C")); reports *const [5:0]u8. If you want to make addSep return something that matches a string literal, check out std.unicode.utf8ToUtf16LeStringLiteral to see how that can be done.
It is very useful! So thank you for adding this up. I didn’t reach the @memcpy stuff yet but you showed a good example how to use it the way I didn’t know.
The only thing I didn’t get is your suggestion on using UTF16. From what I’ve gathered, a string literal in Zig land is a slice of bytes (I think it’s not wrong to say that it is plain ASCII). String literals in other languages (and almost everywhere I’ve seen) are UTF8. Unfortunately, I didn’t reach the “working with strings” part in Zig but I assume when I reach it, I’d be dealing with UTF8 and primitives given by Zig std, wouldn’t I?
var result: [bytes_needed]u8 = undefined;
...
return &result;
Also, I still don’t get how you guys predict whether things are “interned” and produce stable pointers to “.rodata” vs. pointers into the function stack that are gonna be invalidated and lead to a segfault. I often see people rely on that behaviour but I never understood how. Idk, maybe it’s a subject for a separate question…
It’s easy! You just need to figure out whether a string is comptime or not. Being comptime is the same as having a static lifetime. So string literals and any string that you generate in comptime will be interned.
When I was speaking about pointers I meant pointers to data on the stack in general. Probably it’s a bit of a silly question, since u8 already suggests that it could be anything, but who knows… So, does static lifetime apply only to string literals ([]const u8 / []u8) or can it be used with other types as well?
In terms of returning pointers from functions, string literals are the only option, ie returning pointers to values of any other types isn’t valid, unless you’ve heap allocated them, of course. Other than that, any global value will have a static lifetime, too.
TBH, I still don’t get it. In the example given by @squeek502 above, I don’t see any evidence that we return a string literal from the function. Or maybe Zig is smart enough to see that since we passed string literals to a function with anytype, it should be specialized with []const u8, and that is what leads to “interning” when final types become obvious. Idk…
I don’t think that is true, unless I am misunderstanding you. I will caveat that I don’t have any knowledge of the compiler inner workings, just experience reading and writing Zig, but I believe that all comptime variables have static lifetimes. It should be valid to return a reference to a local comptime variable of any type.
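To make that claim concrete, here is a minimal sketch with a non-string type. It uses a local const with a comptime-known initializer rather than a comptime var, and the Point type and origin function are made up for illustration. It reflects my understanding of how such constants behave, not a statement about compiler internals:

```zig
const std = @import("std");

const Point = struct { x: i32, y: i32 };

// `p` is a const whose initializer is comptime-known, so `&p` is expected to
// refer to constant data with static lifetime, not to this call's stack frame.
fn origin() *const Point {
    const p = Point{ .x = 1, .y = 2 };
    return &p;
}

test "pointer to a comptime-known const is usable after the call" {
    const a = origin();
    try std.testing.expectEqual(@as(i32, 1), a.x);
    try std.testing.expectEqual(@as(i32, 2), a.y);
}
```

If `p` were instead a `var` filled in from runtime data, returning `&p` would hand out a pointer into a stack frame that no longer exists.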
It’s very important to still remember that comptime data does not have reliable memory addresses at runtime. So yes, they are static in the sense that the value will be guaranteed, but the variable itself can essentially be expunged.
Meanwhile there are static variables that are not const nor comptime (as in the case of a var declaration in a struct) that do have static lifetimes as well. I just want to make sure that we don’t accidentally imply that static == comptime/const. That aside, everything here is up to par with your usual standard of high quality
I wonder why Zig didn’t simply introduce some kind of static keyword (and by default throw an error if something from within function scope [i.e. a pointer] is getting out) to make things explicit rather than beating around the bush and guessing whether something is static or not?
I would agree if things were more complex but the rules around static lifetimes are (thus far) very simple once you’re familiar.
First, recall that everything in Zig ends up in some kind of struct. Files are structs (that’s why we can import them and use them like structs). So when you’re working in a file, you’re literally working in a big struct (you can use @This() to get the struct type, you can declare member variables, etc).
So anything that you define as a var outside of a function at file level (aka, in the struct you’re working with) becomes a struct declaration. That type now has a variable that can be referenced in the form of T.my_var. That has static lifetime because it belongs to the type, not an instance of a type.
Const is the same thing but it is considered comptime-friendly data. Anything used in a comptime context is evaluated before your program is actually executed. Because of that, it doesn’t share the runtime memory space.
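A small sketch of the container-level var case described above; the Counter type and its members are hypothetical names used only for illustration:

```zig
const std = @import("std");

const Counter = struct {
    // Container-level `var`: it belongs to the type, not to any instance,
    // so there is exactly one `count` and it lives for the whole program.
    var count: u32 = 0;

    fn next() *u32 {
        count += 1;
        // Returning this pointer is fine because `count` is not on the stack.
        return &count;
    }
};

test "container-level var keeps its address and value across calls" {
    const first = Counter.next();
    const second = Counter.next();
    try std.testing.expect(first == second);
    try std.testing.expectEqual(Counter.count, first.*);
}
```

The same holds for a var declared at file level, since the file itself is a struct.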
Sorry, I was only trying to draw attention to the return type of that function (a pointer to an array with a NUL sentinel) and how it is able to return that type since the length of an array has to be comptime known. To do so, it moves the length calculation into a function and calls it when constructing the return type. So, for addSep, that’d look like:
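A minimal sketch of that idea follows. It is a reconstruction under assumptions rather than the exact code from the post: the helper name addSepLen is made up, and every element of args is assumed to be a string.

```zig
const std = @import("std");

// Hypothetical helper: number of bytes in the joined result, without the NUL.
fn addSepLen(comptime sep: []const u8, comptime args: anytype) usize {
    if (args.len == 0) return 0;
    var n: usize = sep.len * (args.len - 1);
    inline for (args) |arg| n += arg.len;
    return n;
}

// The length is computed while constructing the return type, so the result
// can be a pointer to a NUL-terminated array, like a string literal.
pub fn addSep(
    comptime sep: []const u8,
    comptime args: anytype,
) *const [addSepLen(sep, args):0]u8 {
    const len = comptime addSepLen(sep, args);
    const joined = comptime blk: {
        // Zero-initialized, with the sentinel already in place.
        var buf: [len:0]u8 = [_:0]u8{0} ** len;
        var byte_i: usize = 0;
        var arg_i: usize = 0;
        while (arg_i < args.len) : (arg_i += 1) {
            if (arg_i != 0) {
                @memcpy(buf[byte_i..][0..sep.len], sep);
                byte_i += sep.len;
            }
            const arg = args[arg_i];
            @memcpy(buf[byte_i..][0..arg.len], arg);
            byte_i += arg.len;
        }
        break :blk buf;
    };
    // `joined` is a comptime-known const, so its address has static lifetime.
    return &joined;
}

test "addSep matches the type of a string literal" {
    const joined = addSep("=", .{ "A", "B", "C" });
    try std.testing.expectEqualStrings("A=B=C", joined);
    try std.testing.expect(@TypeOf(joined) == @TypeOf("A=B=C"));
}
```

Copying the finished buffer into a const before taking its address avoids returning a pointer to a mutable comptime variable.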
With this, @TypeOf(joined) would == @TypeOf("A=B=C") in the test block (and also means that joined could be passed into a C function that expects a NUL-terminated string).
It’s very important to still remember that comptime data does not have reliable memory addresses at runtime
What does “reliable” refer to in “reliable memory addresses”?
So anything that you define as a var outside of a function…
So can I conclude the following rules to ensure something is static?
If data is defined as var outside a function scope, it automatically becomes part of a type definition (struct) and thus becomes static.
If data is defined as const, it is static as soon as it is referred to somewhere, regardless of whether it is within or outside the function scope. (Here I think a certain pitfall is hidden with regard to stable / reliable memory addresses.)