"null" bytes output when building a formatted string

Hey gang,

I’m still just getting started with Zig but I noticed something odd about some formatting code I wrote. I figured it would be worth sharing to see what you all think.

Given the following code:

const std = @import("std");
pub fn main() void {
    // (negative example) works fine
    std.log.info("{s}", .{"Testing"});

    // build a string from a pre-defined buffer
    var str : [32]u8 = undefined;
    _ = std.fmt.bufPrint(str[0..], "Testing", .{}) catch {};

    // (positive example) 0xaa chars included
    std.log.info("{s}", .{str});
}

We get the following output when piped to hexdump (notice all the aa bytes):

$ zig build-exe test.zig
$ ./test 2>&1 | hexdump -C
00000000  69 6e 66 6f 3a 20 54 65  73 74 69 6e 67 0a 69 6e  |info: Testing.in|
00000010  66 6f 3a 20 54 65 73 74  69 6e 67 aa aa aa aa aa  |fo: Testing.....|
00000020  aa aa aa aa aa aa aa aa  aa aa aa aa aa aa aa aa  |................|
00000030  aa aa aa aa 0a                                    |.....|
00000035

Essentially, it seems that a formatted string argument always prints the full length of the array it is given. For const strings this is totally fine, because the end of the array is the end of the string, but for a partially filled buffer it leads to extra output.

I kinda wonder if this is something that is just overlooked because non-unicode terminals completely skip these characters.

So the question is, should these formatting functions check for these aa bytes and consider that the end of string? Or is this method of constructing strings just a bad practice that should be avoided?

std.fmt.bufPrint prints to the buffer you pass in, and returns a slice of that buffer that contains what was actually printed. In your code, you’re ignoring this returned slice and instead printing the full buffer str, so Zig does exactly that. Check out this version:

const std = @import("std");

pub fn main() void {
    // (negative example) works fine
    std.log.info("{s}", .{"Testing"});

    // build a string from a pre-defined buffer
    var str: [32]u8 = undefined;
    const out = std.fmt.bufPrint(&str, "Testing", .{}) catch unreachable;

    // full buffer: the 0xaa filler bytes are included
    std.log.info("{s}, {}", .{ str, str.len });
    // returned slice: only the bytes that were actually written
    std.log.info("{s}, {}", .{ out, out.len });
}

output:

info: Testing
info: Testing, 32
info: Testing, 7
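
To make the relationship between the buffer and the returned slice concrete, here is a minimal sketch written as `test` blocks, so it can be run with `zig test`. The test names and the deliberately tiny 4-byte buffer are my own choices for illustration. It checks that the returned slice starts at the first byte of the buffer, and that a buffer that is too small produces error.NoSpaceLeft rather than a silently truncated string.

```zig
const std = @import("std");

test "bufPrint returns a slice into the caller's buffer" {
    var buf: [32]u8 = undefined;
    const out = try std.fmt.bufPrint(&buf, "Testing", .{});

    // The slice covers only the bytes that were written...
    try std.testing.expectEqual(@as(usize, 7), out.len);
    try std.testing.expectEqualStrings("Testing", out);
    // ...and it aliases the buffer instead of copying it.
    try std.testing.expect(&out[0] == &buf[0]);
}

test "bufPrint reports a buffer that is too small" {
    var small: [4]u8 = undefined; // hypothetical undersized buffer
    try std.testing.expectError(
        error.NoSpaceLeft,
        std.fmt.bufPrint(&small, "Testing", .{}),
    );
}
```

Since the slice is just a view into buf, it is only valid for as long as buf itself is.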

Ahh, that makes sense. I thought bufPrint only returned the length of what it changed, not a slice of the updated string.

That seems smarter, just a little unintuitive compared to the standard null-terminated strings we usually see.
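
For anyone who does want the null-terminated behaviour, for example to hand the result to C code, here is a small sketch assuming std.fmt.bufPrintZ and std.mem.sliceTo are available in your Zig version. bufPrintZ writes a 0 sentinel after the text, and sliceTo recovers the string by scanning the buffer for that terminator. The test name is only for illustration.

```zig
const std = @import("std");

test "null-terminated output with bufPrintZ" {
    var buf: [32]u8 = undefined;

    // bufPrintZ returns a [:0]u8: a slice with a 0 sentinel right after it.
    const z = try std.fmt.bufPrintZ(&buf, "Testing", .{});
    try std.testing.expectEqual(@as(usize, 7), z.len);
    try std.testing.expectEqual(@as(u8, 0), z[z.len]);

    // C-style recovery: scan the whole buffer up to the first 0 byte.
    const scanned = std.mem.sliceTo(&buf, 0);
    try std.testing.expectEqualStrings("Testing", scanned);
}
```

Scanning only works here because the terminator was explicitly written. With plain bufPrint the rest of the buffer stays undefined, so there is nothing reliable to scan for.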

Yeah, all things strings in Zig may take a little getting used to, but once you get the hang of it, there’s a lot of power and flexibility in the way Zig does it. To make the intent clearer, I would recommend renaming the variables like so:

var buf: [32]u8 = undefined;
const str = std.fmt.bufPrint(&buf, "Testing", .{}) catch unreachable;

I actually solved my original problem by circumventing the local buffer entirely. Now I just merge the format strings and arguments at compile-time and feed it all to a single call. That’s what I wanted initially, but there’s a lot of syntax to search through.

Here’s what my code looks like now.

const std = @import("std");
const SourceLocation = std.builtin.SourceLocation;

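pub inline fn info(
    comptime src: SourceLocation,
    comptime msgFormat: []const u8,
    msgArgs: anytype,
) void {
    // prepend "[file:line] " to the caller's format string at compile time
    const fullFormat = "[{s}:{d}] " ++ msgFormat;
    // prepend the matching arguments; never mutated, so const rather than var
    const fullArgs = .{ std.fs.path.basename(src.file), src.line } ++ msgArgs;
    std.log.info(fullFormat, fullArgs);
}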

pub fn main() void {
    info(@src(), "Hello {s}", .{"Zig!"});
    // outputs: info: [test2.zig:15] Hello Zig!
}

@GarrettTypes in case it is of interest to you at all, I’d also like to point out the existence of std.fmt.comptimePrint.
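
As a rough sketch of what that looks like (max_retries and banner are made-up names for this example, not anything from the code above): comptimePrint formats its arguments entirely at compile time and yields a constant string, so no buffer is involved at all.

```zig
const std = @import("std");

// Hypothetical constant, used only for this illustration.
const max_retries = 3;

// Formatted at compile time; the result is a pointer to a constant,
// null-terminated array that coerces to []const u8.
const banner = std.fmt.comptimePrint("retrying up to {d} times", .{max_retries});

pub fn main() void {
    std.log.info("{s}", .{banner});
}

test "comptimePrint yields a fixed string" {
    try std.testing.expectEqualStrings("retrying up to 3 times", banner);
}
```

The catch is that every argument has to be known at compile time, so this does not replace bufPrint for runtime values.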
