I’d argue that’d be learning the wrong lesson. IMO the lesson is really “you need to think things through when it comes to paths”; making any blind choice can and likely will lead to problems.
My full thoughts can be found here:
I don’t think this is the correct way to look at it. Instead, I’d say string literals in Zig are arbitrary sequences of bytes, and Zig makes it convenient to create UTF-8 encoded string literals.
For example, take @embedFile:
@embedFile(comptime path: []const u8) *const [N:0]u8
This function returns a compile time constant pointer to null-terminated, fixed-size array with length equal to the byte count of the file given by path. The contents of the array are the contents of the file. This i…
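To make the "arbitrary sequences of bytes" point concrete, here is a minimal sketch. The constant names are made up for illustration; the only std call used is `std.unicode.utf8ValidateSlice`. It shows that a literal has the same pointer-to-array shape as the @embedFile return type, and that nothing forces its contents to be valid UTF-8:

```zig
const std = @import("std");

// Illustrative names only. \xNN escapes let a literal hold any bytes at all.
const not_utf8 = "\xff\xff\xff\xff";
// Source files are UTF-8, so unescaped text in a literal ends up as UTF-8 bytes.
const utf8 = "héllo";

test "string literals are just bytes" {
    // Same shape as @embedFile's return type: *const [N:0]u8
    try std.testing.expect(@TypeOf(not_utf8) == *const [4:0]u8);

    // Four bytes that are not valid UTF-8; the compiler does not care.
    try std.testing.expectEqual(@as(usize, 4), not_utf8.len);
    try std.testing.expect(!std.unicode.utf8ValidateSlice(not_utf8));

    // 'é' takes two bytes, so the length is 6 bytes, not 5 characters.
    try std.testing.expectEqual(@as(usize, 6), utf8.len);
    try std.testing.expect(std.unicode.utf8ValidateSlice(utf8));
}
```

Running this with `zig test` should pass; the length is always a byte count, never a character count.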
This bit is particularly relevant:
[…] there is no canonical/portable way to format an arbitrary path as valid UTF-8 (i.e. invalid UTF-8 sequences can be converted into � using a variety of algorithms, but the user cannot ever use that output to reconstruct the actual path)
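A rough sketch of why the lossy output cannot be used to get the path back. `lossyUtf8` below is a hypothetical helper written for this illustration, not a std function, and its byte-by-byte replacement strategy is just one of the "variety of algorithms" mentioned above. Two different paths end up printing identically:

```zig
const std = @import("std");

// Hypothetical helper (not part of std): copies `bytes` into `buf`, replacing
// every byte that does not begin a valid UTF-8 sequence with U+FFFD.
// Assumes `buf` is large enough (3 * bytes.len is always sufficient).
fn lossyUtf8(buf: []u8, bytes: []const u8) []const u8 {
    const replacement: []const u8 = "\u{FFFD}";
    var out: usize = 0;
    var i: usize = 0;
    while (i < bytes.len) {
        // 0 means "not a valid start byte".
        const len: usize = std.unicode.utf8ByteSequenceLength(bytes[i]) catch 0;
        if (len != 0 and i + len <= bytes.len and
            std.unicode.utf8ValidateSlice(bytes[i..][0..len]))
        {
            @memcpy(buf[out..][0..len], bytes[i..][0..len]);
            out += len;
            i += len;
        } else {
            @memcpy(buf[out..][0..replacement.len], replacement);
            out += replacement.len;
            i += 1;
        }
    }
    return buf[0..out];
}

test "lossy conversion is not reversible" {
    var buf_a: [16]u8 = undefined;
    var buf_b: [16]u8 = undefined;
    // Two distinct (and, on Linux, perfectly legal) file names...
    const a = lossyUtf8(&buf_a, "\xff.txt");
    const b = lossyUtf8(&buf_b, "\xfe.txt");
    // ...format to exactly the same text, so the original is unrecoverable.
    try std.testing.expectEqualStrings(a, b);
}
```

Once the replacement character is in the output, there is no way to tell which byte it stood for, so the result is only good for display.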
For Zig, in #19005 I added std.fs.path.fmtAsUtf8Lossy and std.fs.path.fmtWtf16LeAsUtf8Lossy.
However, that may not be the way you want to print paths depending on your use case. For example, ls on Linux prints them shell-escaped:
$ touch `echo 'FF FF FF FF' | xxd -r -p`
$ ls
''$'\377\377\377\377'