Help converting u8 to string

Beginner here, not looking for a ready-made answer but rather suggestions as to where in the std (std.fmt?) documentation I might seek an answer.

I’m writing a CLI program (Windows) that takes a filename as its argument. It reads the file into a variable (well, a constant), then iterates through the bytes and, for each byte, writes one of two things to the screen. If the byte is in the printable ASCII range 32-126, it writes the character; if it’s anything else (for present purposes, a single byte in the range 0-31 or 127-255), I want to write the following string:

"{" + %03d + "}"

where “%03d” represents the byte’s numeric value, zero-padded to three decimal digits. Any pointers will be greatly appreciated.

You will find your answer in the doc comments of std.fmt.format.

Thanks, will have a look-see.

Not sure if you wanted advice just on the format string, or on the iteration and stream handling too. Here is the part of the std.fmt.format doc comment (from the master branch) that describes the format string:

The format string must be comptime-known and may contain placeholders following this format: {[argument][specifier]:[fill][alignment][width].[precision]}

[…]

  • d: output numeric value in decimal notation

[…]

To print literal curly braces, escape them by writing them twice, e.g. {{ or }}.

I’m new to Zig too (so not sure if it’s perfect style / idiomatically written), but I gave it a try, including handling generic streams and buffering:

const std = @import("std");

pub fn writePrintable(output: anytype, input: []const u8) !void {
    for (input) |byte| {
        if (byte >= 32 and byte <= 126) {
            try output.writeByte(byte);
        } else {
            try output.print("{{{d:03}}}", .{byte});
        }
    }
}

pub fn main() !void {
    var stdout_bufwriter = std.io.bufferedWriter(std.io.getStdOut().writer());
    const stdout = stdout_bufwriter.writer(); // never reassigned, so const
    try writePrintable(stdout, "abc\ndef\n");
    try stdout.print("\n", .{});
    try stdout_bufwriter.flush();
}

Output:

abc{010}def{010}

(using Zig 0.15.0-dev.847+850655f06)

Thanks, @jbe. Here’s what I came up with – I needed to save the resulting string to a variable:

const std = @import("std");

pub fn main() !void {
    var buffer: [5]u8 = undefined; // to hold "{nnn}"
    var char_in: u8 = undefined;
    var charnum: []u8 = undefined; // Ascii number of char_in as "{nnn}"

    char_in = 191; // for example
    charnum = try std.fmt.bufPrint(&buffer, "{{{d:0>3}}}", .{char_in});

    std.debug.print("{s}\n", .{charnum});
}
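
For anyone reading along, the bufPrint approach above can be wrapped in a small helper. This is only a sketch: encodeByte is a hypothetical name, not something from the program in this thread, and it assumes the caller supplies the 5-byte buffer.

```zig
const std = @import("std");

/// Formats `byte` as "{nnn}" into `buf` and returns the written slice.
/// `encodeByte` is a hypothetical helper name used for illustration only.
fn encodeByte(buf: *[5]u8, byte: u8) []const u8 {
    // A u8 never needs more than three digits, so "{" + 3 digits + "}"
    // always fits in five bytes and bufPrint cannot run out of space here.
    return std.fmt.bufPrint(buf, "{{{d:0>3}}}", .{byte}) catch unreachable;
}

pub fn main() void {
    var buf: [5]u8 = undefined;
    std.debug.print("{s}\n", .{encodeByte(&buf, 191)});
}
```

The returned slice points into buf, so it is only valid as long as buf is alive and has not been overwritten by another call.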

I was playing around a bit and wondered if and how Zig’s current I/O interface convention and implementation allows using my generic writePrintable function from above to store the result in a string, instead.

I don’t know in which context (for which scope) you want to write into buffers, so not sure if the following is of any interest for you. It’s also quite clunky, and I just mean to share it for sake of demonstration, and not because I think it’s good to do it this way:

pub fn main() !void {
    var char_in: u8 = undefined;
    var output: [5]u8 = undefined; // Ascii number of char_in as "{nnn}"

    char_in = 191; // for example
    {
        var output_writer = std.io.fixedBufferStream(&output);
        try writePrintable(output_writer.writer(), (&char_in)[0..1]);
    }
    std.debug.print("{s}\n", .{output});
}

Some unwieldy things:

  • The FixedBufferStream returned by fixedBufferStream is not a std.io.GenericWriter. To obtain one, we need to make output_writer a variable and access its writer method.
  • My writePrintable function takes a slice as input. If you have a single u8, it seems to be possible to create a one-element slice with the (&char_in)[0..1] syntax.
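
As a small illustration of the second point, here is a sketch of two ways to view a single u8 as a one-element slice. The helper name asSlice is hypothetical, and the array-literal variant is just an alternative I believe is equivalent for this purpose.

```zig
const std = @import("std");

/// Views a single byte as a one-element slice (hypothetical helper name).
fn asSlice(byte: *const u8) []const u8 {
    // Slicing a single-item pointer with [0..1] yields a pointer to a
    // one-element array, which coerces to a slice.
    return byte[0..1];
}

pub fn main() void {
    const char_in: u8 = 191;
    const via_pointer = asSlice(&char_in);
    // Alternative: take the address of a one-element array literal.
    const via_array: []const u8 = &[_]u8{char_in};
    std.debug.print("{d} {d}\n", .{ via_pointer.len, via_array.len });
    std.debug.assert(via_pointer[0] == via_array[0]);
}
```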

Also, I assume some unnecessary runtime data is created here, like the input slice’s length (which is always one), or the FixedBufferStream’s current writing position, which is not really needed anymore after the write if only one encoded character is written into the buffer.

As far as I understood the Zig Showtime episode #41, the I/O interface might undergo some deep changes with the reintroduction of async/await in future Zig versions.

I finished the program that prompted my original question in this thread and thought I’d share it here. xencode.exe takes a binary file, or any file, and encodes it as readable plain text. Notably, high- and low-order ASCII chars (bytes outside the printable range) are encoded as 3-digit decimal numbers enclosed in curly braces: {nnn}. There is an option -a that only displays the plain-text characters in the file, and an option -x that applies readability aids to the encodings of files and programs produced by the old DOS word processor XyWrite. We devised this system to make it easy to share files and macros on a plain-text mailing list. It’s quirky, but I believe it has uses beyond the original narrow purpose. The next step will be to write the companion program, xdecode.exe, which will decode encoded output back to the original. I did this as an exercise for learning some Zig. It was fun and I think I learned a lot.

// xencode.zig: Human-readable plain-text encoding for binary files
// CLD rev. 2025-07-09_23:25

const std = @import("std");
const builtin = @import("builtin");
var arena = std.heap.ArenaAllocator.init(std.heap.page_allocator);
const ap_allocator = arena.allocator();
var tw: u8 = 65; // text width, output
var twct: u8 = 1; // tw counter
var alist = std.ArrayList(u8).init(ap_allocator);

pub fn main() !void {
	const stdout = std.io.getStdOut().writer();
	defer arena.deinit();
	// defer alist.deinit();
	const args = try std.process.argsAlloc(ap_allocator);
	if (args.len < 2) {
		showHelp();
		return;
	}
	var omode: i8 = 0; // output mode: -1=printable only,
                     // 0=suppress XyWrite readability aids,
                     // 1=apply XyWrite readability aids
	var i: u8 = 1;
	var eval_arg = std.mem.orderZ(u8, "-a", "-x");
	var byte2: u16 = undefined;
	var byte3: u16 = undefined;
	const tab = [_]u8{'{', 't', 'a', 'b', '}'};
	const tabs = tab[0..];
	const crlf = [_]u8{'[', 'c', 'r', '|', 'l', 'f', ']'};
	const crlfs = crlf[0..];
	const lbrc = [_]u8{'{', '0', '9', '1', '}'};
	const lbrcs = lbrc[0..];
	const rbrc = [_]u8{'{', '0', '9', '3', '}'};
	const rbrcs = rbrc[0..];
	const lguil = [_]u8{'{', '<', '}'};
	const lguils = lguil[0..];
	const rguil = [_]u8{'{', '>', '}'};
	const rguils = rguil[0..];
 	const spce = [_]u8{'{', '0', '3', '2', '}'};
	const spces = spce[0..];
	var ctr: u8 = 0;
	var n: u32 = 0;

// Poll command-line options (all -# option args must come first)
	while (i < args.len) {
		eval_arg = std.mem.orderZ(u8, args[i], "-?");
		if (eval_arg == .eq) {
			showHelp();
			return;
		}
		eval_arg = std.mem.orderZ(u8, args[i], "/?");
		if (eval_arg == .eq) {
			showHelp();
			return;
		}
		eval_arg = std.mem.orderZ(u8, args[i], "--help");
		if (eval_arg == .eq) {
			showHelp();
			return;
		}
		eval_arg = std.mem.orderZ(u8, args[i], "-a");
		if (eval_arg == .eq) {
			omode = -1;
			i += 1;
			continue;
		}
		eval_arg = std.mem.orderZ(u8, args[i], "/a");
		if (eval_arg == .eq) {
			omode = -1;
			i += 1;
			continue;
		}
		eval_arg = std.mem.orderZ(u8, args[i], "-x");
		if (eval_arg == .eq) {
			omode = 1;
			i += 1;
			continue;
		}
		eval_arg = std.mem.orderZ(u8, args[i], "/x");
		if (eval_arg == .eq) {
			omode = 1;
			i += 1;
			continue;
		}
		eval_arg = std.mem.orderZ(u8, args[i], "-w");
		if (eval_arg == .eq) {
			if (i + 1 <= args.len - 1) {
				const twa = try std.fmt.parseInt(u8, args[i + 1], 10);
				tw = twa;
				i += 2;
				continue;
			}
		}
		eval_arg = std.mem.orderZ(u8, args[i], "/w");
		if (eval_arg == .eq) {
			if (i + 1 <= args.len - 1) {
				const twa = try std.fmt.parseInt(u8, args[i + 1], 10);
				tw = twa;
				i += 2;
				continue;
			}
		}
		break;
	}
	if (i > args.len - 1) {
		showHelp();
		return;
	}
	const file_in: []u8 = args[i];
	std.fs.cwd().access(file_in, .{}) catch |err| {
		switch (err) {
			error.FileNotFound => {
				std.debug.print("File not found: \"{s}\"\n", .{file_in});
				return;
			},
			else => { // any other access error
				std.debug.print("Unexpected: {}\n", .{err});
				return;
			}
		}
	};
	var file_ex: []u8 = file_in;
	if (i + 1 == args.len - 1) {
		i += 1;
		file_ex = args[i];
	}
	const data: []u8 = try readFile(file_in);
	const header1 = "XPLeNCODE v2.0 (xencode.exe)";
	const header2 = "b-gin [UNTITLED]";
	for (header1[0..]) |byte| {
		try alist.append(byte);
	}
	try addNewline();
	for (header2[0..]) |byte| {
		try alist.append(byte);
	}
	try addNewline();
	const footer1 = "-nd";
	const footer2 = "XPLeNCODE";
	while (n < data.len) {
		if (omode < 0) {
			if (data[n] > 32 and data[n] < 127) {
				try writeByteLnBk(data[n]);
			}
			else if (data[n] == 32) {
				if (twct > 1 and twct < tw - 1) try writeByteLnBk(' ')
					else try writeAllLnBk(spces);	
			} else {
				try writeByteLnBk('.');
			}
			n += 1;
			continue;
		}
		ctr = 0;
		try switch (data[n]) {
			9 => {
				switch (omode) {
					0 => try putCharNumBr(9),
					1 => try writeAllLnBk(tabs),
					else => try writeByteLnBk('.'),
				}
			},
			13 => {
				if (omode == 0) {
					try putCharNumBr(data[n]);
					n += 1;
					continue;
				}
				if (n + 1 < data.len and data[n + 1] == 10) { // a CR may be the last byte
					try writeAllLnBk(crlfs);
					n += 1;
				} else {
					try putCharNumBr(data[n]);
				}
			},
			32 => {
				if (twct > 1 and twct < tw - 1) try writeByteLnBk(' ')
					else try writeAllLnBk(spces);	
			},
			33...90 => try writeByteLnBk(data[n]),
			91 => {
				if (omode > 0) try writeAllLnBk(lbrcs)
					else try writeByteLnBk(data[n]);
			},
			92 => try writeByteLnBk(data[n]),
			93 => {
				if (omode > 0) try writeAllLnBk(rbrcs)
					else try writeByteLnBk(data[n]);
			},
			94...122 => try writeByteLnBk(data[n]),
			123 => putCharNumBr(data[n]),
			124 => try writeByteLnBk(data[n]),
			125 => putCharNumBr(data[n]),
			174 => {
				if (omode > 0) try writeAllLnBk(lguils)
					else try putCharNumBr(data[n]);
			},
			175 => {
				if (omode > 0) try writeAllLnBk(rguils)
					else try putCharNumBr(data[n]);
			},
			254 => { // Speedo charset
				if (omode < 1) {
					try putCharNumBr(data[n]);
					n += 1;
					continue;
				}
				if (data.len - n < 3) {
					try putCharNumBr(data[n]);
					n += 1;
					continue;
				}
				byte2  = data[n + 1];
				byte3  = data[n + 2];
				if (byte3 < 2 or (byte3 < 3 and byte2 < 232)) {
					try putXySpeedoChar(byte2, byte3);
					n += 3;
					continue; // skip the trailing n += 1, like the other 3-byte branches
				} else {
 				// Generic 3-byter [254+nnn+nnn]
					try put3byterBr(254, byte2, byte3);
					n += 3;
					continue;
				}
			},
			255 => {
				if (omode < 1) {
					try putCharNumBr(data[n]);
					n += 1;
					continue;
				}
				if (data.len - n < 3) {
					try putCharNumBr(data[n]);
					n += 1;
					continue;
				}
				byte2  = data[n + 1];
				byte3  = data[n + 2];
				// 3-byte functions
				if (byte2 > 127 and byte2 < 131 and byte3 % 2 == 1) {
						try putXyFunc(byte2, byte3);
						n += 3;
						continue;
				}
				// Search wildcards
				else if ((byte2 == 192 and (byte3 == 145 or byte3 == 153 or byte3 == 155 or byte3 == 173 or byte3 == 174 or (byte3 >= 176 and byte3 <= 185) or byte3 == 193 or byte3 == 204 or byte3 == 206 or byte3 == 207 or byte3 == 211 or byte3 == 215 or byte3 == 216)) or (byte2 == 193 and (byte3 == 46 or byte3 == 47))) {
					try putXyWildcard(byte3);
					n += 3;
					continue;
				} else {
 				// Generic 3-byter [255+nnn+nnn]
					try put3byterBr(255, byte2, byte3);
					n += 3;
					continue;
				}
			},
			else => try putCharNumBr(data[n]),
		};
		n += 1;
	}
	if (twct > 1) try addNewline();
	for (footer1[0..]) |byte| {
		try alist.append(byte);
	}
	try addNewline();
	for (footer2[0..]) |byte| {
		try alist.append(byte);
	}
	try addNewline();
	if (!std.mem.eql(u8, file_ex, file_in)) {
		try writeFile(file_ex, alist.items);
	} else {
		for (alist.items) |byte| {
			try stdout.writeByte(byte);
		}
	}
}

pub fn showHelp() void {
	std.debug.print("Human-readable Plain-text Encoding for Binary Files\n                              [CLD rev. 2025-07-09]\n\nUsage (optional arguments first):\nxencode [-a|-x] [-w <number>] file_in [file_out] | [-?]\n\nIf file_out is omitted, output is directed to stdout.\n\nOptions:\n  -a outputs Ascii 32-126 only (not decodable)\n  -x applies XyWrite readability aids to output\n  -w <number> changes text width of output to <number>\n     (default = 65 characters per line)\n  -? shows this help\n", .{});
}

pub fn readFile(fname: []const u8) ![]u8 {
	const fileContents = try std.fs.cwd().readFileAlloc(
		ap_allocator,
		fname,
		std.math.maxInt(u32));
	return fileContents;
}

pub fn writeFile(file_path: []const u8, data: []const u8) !void {
    // Open the file for writing
    const file = try std.fs.cwd().createFile(file_path, .{});
    defer file.close();

    // Write the data to the file
    try file.writeAll(data);
}

pub fn writeByteLnBk(byte: u8) !void {
	try alist.append(byte);
	twct += 1;
	if (twct > tw) {
		if (builtin.target.os.tag == .windows) try alist.append(13);
		try alist.append(10);
		twct = 1;
	}
}

pub fn writeAllLnBk(bytes: []const u8) !void {
	for (bytes) |b| try writeByteLnBk(b);
}

pub fn addNewline() !void {
	if (builtin.target.os.tag == .windows) try alist.append(13);
	try alist.append(10);
}

pub fn putCharNumBr(char_in: u8) !void {
	var buffer: [5]u8 = undefined;
	const charnum: []u8 = try std.fmt.bufPrint(
		&buffer,
		"{{{d:0>3}}}",
		.{char_in});
	try writeAllLnBk(charnum);
}

pub fn put3byterBr(byte1: u16, byte2: u16, byte3: u16) !void {
	var buffer: [13]u8 = undefined;
	const brace3: []u8 = try std.fmt.bufPrint(
		&buffer,
		"[{d:0>3}+{d:0>3}+{d:0>3}]", 
		.{byte1, byte2, byte3});
	try writeAllLnBk(brace3);
}

pub fn putXyFunc(byte_2: u16, byte_3: u16) !void {
	const index: u32 = (256 * (byte_2 % 128) / 2) + (byte_3 / 2);
	var afunc: [5]u8 = undefined;
	const func_no = [_]u8{'[', '2', '5', '5', '+', '1', '2', '9', '+',
		 '1', '6', '3', ']'};
	const func_nos = func_no[0..];
	const xyfuncs = [_]*const [2:0]u8{
"@0", "@1", "@2", "@3", "@4", "@5", "@6", "@7", "@8", "@9", "@A", "@B",
"@C", "@D", "@E", "@F", "@G", "@H", "@I", "@J", "@K", "@L", "@M", "@N",
"@O", "@P", "@Q", "@R", "@S", "@T", "@U", "@V", "@W", "@X", "@Y", "@Z",
"AD", "AS", "BF", "BK", "BS", "CC", "CD", "CH", "CI", "CL", "CM", "CN",
"CP", "CR", "CS", "CU", "DC", "DF", "GH", "DL", "DP", "DS", "DW", "EL",
"ER", "EX", "GT", "HM", "M0", "M1", "M2", "M3", "M4", "M5", "M6", "M7",
"M8", "MD", "MU", "MV", "NC", "NL", "NK", "NP", "NR", "NS", "NT", "NW",
"PC", "PD", "PL", "PP", "PR", "PS", "PT", "PU", "PW", "R0", "R1", "R2",
"R3", "R4", "R5", "R6", "R7", "R8", "R9", "RC", "RD", "RE", "RL", "RP",
"RS", "RV", "RW", "SD", "SH", "SI", "SK", "SM", "SN", "SS", "SU", "SV",
"TF", "TI", "TN", "TS", "UD", "WA", "WC", "WL", "WN", "WS", "WX", "WW",
"XC", "XD", "DT", "S1", "S2", "S3", "S4", "S5", "S6", "S7", "SP", "BC",
"LB", "LE", "NF", "PF", "TP", "BD", "MS", "NM", "LD", "LL", "LR", "LU",
"UP", "FF", "YD", "DO", "DX", "MK", "SO", "OP", "WZ", "NX", "SW", "FD",
"FM", "TL", "TR", "TE", "ED", "EE", "HC", "EC", "MC", "#1", "#2", "#3",
"#4", "#5", "#6", "#7", "#8", "#9", "$1", "$2", "$3", "$4", "$5", "$6",
"$7", "$8", "$9", "DR", "EN", "C0", "C1", "C2", "C3", "C4", "C5", "C6",
"C7", "C8", "C9", "EF", "IB", "NO", "NI", "CO", "$0", "LS", "XP", "WG",
"XM", "&0", "&1", "&2", "&3", "&4", "&5", "&6", "&7", "&8", "&9", "&A",
"&B", "&C", "&D", "&E", "&F", "&G", "&H", "&I", "&J", "&K", "&L", "&M",
"&N", "&O", "&P", "&Q", "&R", "&S", "&T", "&U", "&V", "&W", "&X", "&Y",
"&Z", "HL", "$A", "$B", "$C", "$D", "$E", "$F", "$G", "$H", "$I", "$J",
"$K", "$L", "$M", "$N", "$O", "$P", "$Q", "$R", "$S", "$T", "$U", "$V",
"$W", "$X", "$Y", "$Z", "XX", "H@", "VH", "MW", "QH", "DK", "SR", "SC",
"TG", "H1", "JH", "DZ", "DD", "DM", "LT", "RK", "NN", "MT", "ET", "ZT",
"T1", "TT", "<<", ">>", "IT", "SL", "SF", "FL", "FR", "FC", "SY", "ME",
"AC", "FS", "TW", "MI", "RO", "NB", "Q1", "Q2", "Q3", "Q4", "Q5", "Q6",
"Q7", "Q8", "TO", "IR", "AR", "AX", "DB", "DE", "HF", "SA", "OV", "TC",
"TB", "JM", "SG", "XH", "FT", "BX", "MN", "CB", "M9", "MZ", "ZZ", "RX",
"ST", "KF", "JC", "AK", "TM", "NU", "B4", "QP", "HG", "US", "XE", "ES",
"RB", "S-", "S+", "**", "BN", "RU", "CF", "UI", "XS", "EA", "BT", "KD",
"DN", "HI", "WH", "XN", "FX", "UN", "MX", "AZ", "BR", "HK", "#X", "??",
"BM", "JR", "XO", "XW", "TX", "LF", "LO", "BL", "XT", "WT", "IC", "CT",
"VB", "-D", "WD", "RM", "LM", "aL", "aR", "aB", "aE", "MP", "mN", "QL",
"QR", "MF"};
	if (index < xyfuncs.len) {
		afunc[0] = '[';
		afunc[1] = xyfuncs[index][0];
		afunc[2] = xyfuncs[index][1];
		if (afunc[1] == 'N' and afunc[2] == 'O') {
			try writeAllLnBk(func_nos); // workaround for flaky func NO
			return;
		}
		afunc[3] = '_';
		afunc[4] = ']';
		try writeAllLnBk(afunc[0..]);
	}
}

pub fn putXyWildcard(byte_3: u16) !void {
	var wild: [4]u8 = undefined;
	wild[0] = '[';
	wild[1] = 'w';
	var wild1: [5]u8 = undefined;
	wild1[0] = '[';
	wild1[1] = 'w';
	var wilddot = [_]u8{
		'[', '2', '5', '5', '+', '1', '9', '2', '+', '1', '7', '4', ']'};
	var c: u3 = 2;
	switch (byte_3) {
		46 => {wild[c] = '<'; c += 1;}, 
		47 => {wild[c] = '>'; c += 1;},
		145 => {wild1[c] = '1'; wild1[c + 1] = '3'; c += 2;},
		153 => {wild1[c] = '1'; wild1[c + 1] = '0'; c += 2;},
		155 => {wild[c] = 'C'; c += 1;}, 
		173 => {wild[c] = '-'; c += 1;}, 
		174 => {wild[c] = '.'; c += 1;}, 
		176 => {wild[c] = '0'; c += 1;}, 
		177 => {wild[c] = '1'; c += 1;}, 
		178 => {wild[c] = '2'; c += 1;}, 
		179 => {wild[c] = '3'; c += 1;}, 
		180 => {wild[c] = '4'; c += 1;}, 
		181 => {wild[c] = '5'; c += 1;}, 
		182 => {wild[c] = '6'; c += 1;}, 
		183 => {wild[c] = '7'; c += 1;}, 
		184 => {wild[c] = '8'; c += 1;}, 
		185 => {wild[c] = '9'; c += 1;}, 
		193 => {wild[c] = 'A'; c += 1;}, 
		204 => {wild[c] = 'L'; c += 1;}, 
		206 => {wild[c] = 'N'; c += 1;}, 
		207 => {wild[c] = 'O'; c += 1;}, 
		211 => {wild[c] = 'S'; c += 1;}, 
		215 => {wild[c] = 'W'; c += 1;}, 
		216 => {wild[c] = 'X'; c += 1;},
		else => wild[0] = 0,
	}
	if (wild[2] == '.') {
		try writeAllLnBk(wilddot[0..]);
		return;
	}
	if (c > 3) {
		wild1[c] = ']';
		try writeAllLnBk(wild1[0..]);
	} else {
		wild[c] = ']';
		try writeAllLnBk(wild[0..]);
	}
}

pub fn putXySpeedoChar(byte_2: u16, byte_3: u16) !void {
	const index: u32 = (256 * (1 + byte_3)) + byte_2;
	var buffer: [5]u8 = undefined;
	const speedo: []u8 = try std.fmt.bufPrint(
		&buffer,
		"[{d}]", .{index});
	try writeAllLnBk(speedo);
}
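
For the companion decoder mentioned above, here is a minimal sketch of the reverse direction. It is only an illustration under loud assumptions: decodeBraces is a hypothetical name, it handles only the plain {nnn} form (not the [..] forms, the -x readability aids, or the inserted line breaks), and it writes into a caller-supplied buffer.

```zig
const std = @import("std");

/// Decodes "{nnn}" escapes from `input` into `out`; all other bytes are
/// copied unchanged. Hypothetical helper: a real xdecode would also have to
/// handle the bracketed forms and strip the line breaks added by the encoder.
fn decodeBraces(out: []u8, input: []const u8) ![]u8 {
    var i: usize = 0; // read position in input
    var n: usize = 0; // write position in out
    while (i < input.len) {
        if (n >= out.len) return error.NoSpaceLeft;
        if (input[i] == '{' and i + 4 < input.len and input[i + 4] == '}') {
            // Three decimal digits between the braces; parseInt rejects
            // non-digits and values above 255.
            out[n] = try std.fmt.parseInt(u8, input[i + 1 .. i + 4], 10);
            i += 5;
        } else {
            out[n] = input[i];
            i += 1;
        }
        n += 1;
    }
    return out[0..n];
}

pub fn main() !void {
    var out: [16]u8 = undefined;
    const decoded = try decodeBraces(&out, "abc{010}def");
    std.debug.print("{d} bytes decoded\n", .{decoded.len});
}
```

Because the encoder writes literal braces as {123} and {125}, every '{' in default-mode output should start an escape, which is what makes this simple scan workable.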

Remember to use a buffered writer and flush it, to avoid making way too many syscalls, otherwise you are writing single bytes unbuffered.

Something I noticed:

Why not:

const tabs: []const u8 = "{tab}";

or:

const tabs = "{tab}";

?

Thanks, this is just the kind of constructive criticism I hoped I’d get by posting my code: I’m a naive beginner when it comes to lower-level programming. Now I’ll have to be sensitive to syscalls, which, I take it, are expensive? Thanks again.
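
To make the cost difference concrete, here is a sketch that avoids real I/O entirely. CountingSink and Buffered are hypothetical stand-ins written for this example (they are not std types): the sink counts how often it is called, which plays the role of a syscall, and the 4096-byte size is an assumption chosen to resemble a typical buffered writer.

```zig
const std = @import("std");

/// Hypothetical sink that only counts calls; each call stands in for a syscall.
const CountingSink = struct {
    calls: usize = 0,
    bytes: usize = 0,

    fn write(self: *CountingSink, data: []const u8) void {
        self.calls += 1;
        self.bytes += data.len;
    }
};

/// Minimal hand-rolled buffer in front of a CountingSink (illustration only).
fn Buffered(comptime size: usize) type {
    return struct {
        sink: *CountingSink,
        buf: [size]u8 = undefined,
        len: usize = 0,

        const Self = @This();

        fn writeByte(self: *Self, b: u8) void {
            // Only forward to the sink when the buffer is full.
            if (self.len == size) self.flush();
            self.buf[self.len] = b;
            self.len += 1;
        }

        fn flush(self: *Self) void {
            if (self.len == 0) return;
            self.sink.write(self.buf[0..self.len]);
            self.len = 0;
        }
    };
}

pub fn main() void {
    var direct = CountingSink{};
    var batched = CountingSink{};
    var bw = Buffered(4096){ .sink = &batched };

    for (0..10_000) |_| {
        direct.write(&[_]u8{'x'}); // one "syscall" per byte
        bw.writeByte('x'); // collected in memory first
    }
    bw.flush(); // without this, the last partial buffer would be lost

    std.debug.print("direct: {d} calls, buffered: {d} calls\n", .{ direct.calls, batched.calls });
}
```

The same number of bytes arrives at both sinks; only the number of calls differs, and that is the part that is expensive when each call crosses into the operating system.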

Thanks, I’ll try this. What I ended up with was a workaround for my thin understanding of the various types associated with slices and strings. I’ve read various tutorials on it, but so far the stuff has not sunk in.
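
Since the slice and string types came up, here is a small sketch that prints the types involved. The names tab_array, tab_literal and tab_slice are made up for this example; the point is that a string literal is a pointer to a sentinel-terminated array and coerces to []const u8 without any manual slicing.

```zig
const std = @import("std");

const tab_array = [_]u8{ '{', 't', 'a', 'b', '}' }; // type: [5]u8
const tab_literal = "{tab}"; // type: *const [5:0]u8
const tab_slice: []const u8 = tab_literal; // coerced to a slice

pub fn main() void {
    std.debug.print("{s}\n{s}\n{s}\n", .{
        @typeName(@TypeOf(tab_array)),
        @typeName(@TypeOf(tab_literal)),
        @typeName(@TypeOf(tab_slice)),
    });
    // All three hold the same five bytes.
    std.debug.assert(std.mem.eql(u8, tab_array[0..], tab_slice));
}
```

A function parameter of type []const u8 accepts any of the three (the array via tab_array[0..] or &tab_array), which is why the literal form is usually the least work.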

The usual way to do this is to use:

if (std.mem.eql(u8, args[i], "-?")) {
	showHelp();
	return;
}

This compares two slices. It is also faster, because orderZ has to go byte by byte, whereas eql can return false immediately if the slices have different lengths.
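
Building on the eql suggestion, the repeated checks for alternative spellings of one option could be folded into a helper. This is only a sketch: matchesAny is a hypothetical name, not a std function.

```zig
const std = @import("std");

/// Returns true if `arg` equals any of the candidate spellings.
/// `matchesAny` is a hypothetical helper, shown for illustration.
fn matchesAny(arg: []const u8, candidates: []const []const u8) bool {
    for (candidates) |candidate| {
        if (std.mem.eql(u8, arg, candidate)) return true;
    }
    return false;
}

pub fn main() void {
    const arg: []const u8 = "/?";
    if (matchesAny(arg, &.{ "-?", "/?", "--help" })) {
        std.debug.print("would show help\n", .{});
    }
}
```

With this, each option needs one if per meaning rather than one per spelling.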

Makes sense and I’ve modified the code accordingly. Appreciate getting help from a moderator who has better things to do…! :wink:

Thanks again for all the suggestions. See further below for updated code. With regard to buffered writing, I want to highlight my revised function and ask if I’ve done it correctly. I note that what I’ve done here doesn’t specify the size of the buffer. Does that mean that a standard memory page is used? Would I be better off using a larger buffer?

pub fn fileBufWrite(file_path: []const u8, data: []const u8) !void {
	const file = try std.fs.cwd().createFile(file_path, .{});
	defer file.close();
	var bufwriter = std.io.bufferedWriter(file.writer());
	const writer = bufwriter.writer();
	try writer.print("{s}", .{data});
	try bufwriter.flush();
}

The updated code:

// xencode.zig: Human-readable plain-text encoding for binary files
// CLD rev. 2025-07-11_03:35

const std = @import("std");
const builtin = @import("builtin");
var arena = std.heap.ArenaAllocator.init(std.heap.page_allocator);
const ap_allocator = arena.allocator();
var tw: u8 = 65; // text width, output
var twct: u8 = 1; // tw counter
var alist = std.ArrayList(u8).init(ap_allocator);

pub fn main() !void {
	defer arena.deinit();
	const stdout_unbuf = std.io.getStdOut().writer();
	var stdout_buffered = std.io.bufferedWriter(stdout_unbuf);
	const stdout = stdout_buffered.writer();
	const args = try std.process.argsAlloc(ap_allocator);
	if (args.len < 2) {
		showHelp();
		return;
	}
	var omode: i8 = 0; // output mode: -1=printable only,
                     // 0=suppress XyWrite readability aids,
                     // 1=apply XyWrite readability aids
	var i: u8 = 1;
	var byte2: u16 = undefined;
	var byte3: u16 = undefined;
 	const tab: []const u8 = "{tab}";
	const crlf0: []const u8 = "[013+010]";
	const crlf: []const u8 = "[cr|lf]";
	const lbrc: []const u8 = "{091}";
	const rbrc: []const u8 = "{093}";
	const lguil: []const u8 = "{<}";
	const rguil: []const u8 = "{>}";
	const spce: []const u8 = "{032}";
	var ctr: u8 = 0;
	var n: u32 = 0;

// Poll command-line options (all -# option args must come first)
	while (i < args.len) {
		if (std.mem.eql(u8, args[i], "-?")) {
			showHelp();
			return;
		}
		if (std.mem.eql(u8, args[i], "/?")) {
			showHelp();
			return;
		}
		if (std.mem.eql(u8, args[i], "--help")) {
			showHelp();
			return;
		}
		if (std.mem.eql(u8, args[i], "-a")) {
			omode = -1;
			i += 1;
			continue;
		}
		if (std.mem.eql(u8, args[i], "/a")) {
			omode = -1;
			i += 1;
			continue;
		}
		if (std.mem.eql(u8, args[i], "-x")) {
			omode = 1;
			i += 1;
			continue;
		}
		if (std.mem.eql(u8, args[i], "/x")) {
			omode = 1;
			i += 1;
			continue;
		}
		if (std.mem.eql(u8, args[i], "-w")) {
			if (i + 1 <= args.len - 1) {
				const twa = try std.fmt.parseInt(u8, args[i + 1], 10);
				tw = twa;
				i += 2;
				continue;
			}
		}
		if (std.mem.eql(u8, args[i], "/w")) {
			if (i + 1 <= args.len - 1) {
				const twa = try std.fmt.parseInt(u8, args[i + 1], 10);
				tw = twa;
				i += 2;
				continue;
			}
		}
		break;
	}
	if (i > args.len - 1) {
		showHelp();
		return;
	}
	const file_in: []u8 = args[i];
	std.fs.cwd().access(file_in, .{}) catch |err| {
		switch (err) {
			error.FileNotFound => {
				std.debug.print("File not found: \"{s}\"\n", .{file_in});
				return;
			},
			else => { // any other access error
				std.debug.print("Unexpected: {}\n", .{err});
				return;
			}
		}
	};
	var file_ex: []u8 = file_in;
	if (i + 1 == args.len - 1) {
		i += 1;
		file_ex = args[i];
	}
	const data: []u8 = try fileRead(file_in);
	const header1 = "XPLeNCODE v2.0 (xencode.exe)";
	const header2 = "b-gin [UNTITLED]";
	for (header1[0..]) |byte| {
		try alist.append(byte);
	}
	try addNewline();
	for (header2[0..]) |byte| {
		try alist.append(byte);
	}
	try addNewline();
	const footer1 = "-nd";
	const footer2 = "XPLeNCODE";
	while (n < data.len) {
		if (omode < 0) {
			if (data[n] > 32 and data[n] < 127) {
				try writeByteLnBk(data[n]);
			}
			else if (data[n] == 32) {
				if (twct > 1 and twct < tw - 1) try writeByteLnBk(' ')
					else try writeAllLnBk(spce);	
			} else {
				try writeByteLnBk('.');
			}
			n += 1;
			continue;
		}
		ctr = 0;
		try switch (data[n]) {
			9 => {
				switch (omode) {
					0 => try putCharNumBr(9),
					1 => try writeAllLnBk(tab),
					else => try writeByteLnBk('.'),
				}
			},
			13 => {
				if (omode < 0) {
					try writeByteLnBk('.');
					n += 1;
					continue;
				}
				if (n + 1 < data.len and data[n + 1] == 10) { // a CR may be the last byte
					if (omode > 0) {
						try writeAllLnBk(crlf);
					} else {
						try writeAllLnBk(crlf0);
					}
					n += 1;
				} else {
					try putCharNumBr(data[n]);
				}
			},
			32 => {
				if (twct > 1 and twct < tw - 1) try writeByteLnBk(' ')
					else try writeAllLnBk(spce);	
			},
			33...90 => try writeByteLnBk(data[n]),
			91 => {
				if (omode > 0) try writeAllLnBk(lbrc)
					else try writeByteLnBk(data[n]);
			},
			92 => try writeByteLnBk(data[n]),
			93 => {
				if (omode > 0) try writeAllLnBk(rbrc)
					else try writeByteLnBk(data[n]);
			},
			94...122 => try writeByteLnBk(data[n]),
			123 => putCharNumBr(data[n]),
			124 => try writeByteLnBk(data[n]),
			125 => putCharNumBr(data[n]),
			174 => {
				if (omode > 0) try writeAllLnBk(lguil)
					else try putCharNumBr(data[n]);
			},
			175 => {
				if (omode > 0) try writeAllLnBk(rguil)
					else try putCharNumBr(data[n]);
			},
			254 => { // Speedo charset
				if (omode < 1) {
					try putCharNumBr(data[n]);
					n += 1;
					continue;
				}
				if (data.len - n < 3) {
					try putCharNumBr(data[n]);
					n += 1;
					continue;
				}
				byte2  = data[n + 1];
				byte3  = data[n + 2];
				if (byte3 < 2 or (byte3 < 3 and byte2 < 232)) {
					try putXySpeedoChar(byte2, byte3);
					n += 3;
					continue; // skip the trailing n += 1, like the other 3-byte branches
				} else {
 				// Generic 3-byter [254+nnn+nnn]
					try put3byterBr(254, byte2, byte3);
					n += 3;
					continue;
				}
			},
			255 => {
				if (omode < 1) {
					try putCharNumBr(data[n]);
					n += 1;
					continue;
				}
				if (data.len - n < 3) {
					try putCharNumBr(data[n]);
					n += 1;
					continue;
				}
				byte2  = data[n + 1];
				byte3  = data[n + 2];
				// 3-byte functions
				if (byte2 > 127 and byte2 < 131 and byte3 % 2 == 1) {
						try putXyFunc(byte2, byte3);
						n += 3;
						continue;
				}
				// Search wildcards
				else if ((byte2 == 192 and (byte3 == 145 or byte3 == 153 or byte3 == 155 or byte3 == 173 or byte3 == 174 or (byte3 >= 176 and byte3 <= 185) or byte3 == 193 or byte3 == 204 or byte3 == 206 or byte3 == 207 or byte3 == 211 or byte3 == 215 or byte3 == 216)) or (byte2 == 193 and (byte3 == 46 or byte3 == 47))) {
					try putXyWildcard(byte3);
					n += 3;
					continue;
				} else {
 				// Generic 3-byter [255+nnn+nnn]
					try put3byterBr(255, byte2, byte3);
					n += 3;
					continue;
				}
			},
			else => try putCharNumBr(data[n]),
		};
		n += 1;
	}
	if (twct > 1) try addNewline();
	for (footer1[0..]) |byte| {
		try alist.append(byte);
	}
	try addNewline();
	for (footer2[0..]) |byte| {
		try alist.append(byte);
	}
	try addNewline();
	if (!std.mem.eql(u8, file_ex, file_in)) {
		try fileBufWrite(file_ex, alist.items);
	} else {
		for (alist.items) |byte| {
			try stdout.writeByte(byte);
		}
		try stdout_buffered.flush(); // flush once, after all bytes are queued
	}
}

pub fn showHelp() void {
	std.debug.print("Human-readable Plain-text Encoding for Binary Files\n                              [CLD rev. 2025-07-09]\n\nUsage (optional arguments first):\nxencode [-a|-x] [-w <number>] file_in [file_out] | [-?]\n\nIf file_out is omitted, output is directed to stdout.\n\nOptions:\n  -a outputs Ascii 32-126 only (not decodable)\n  -x applies XyWrite readability aids to output\n  -w <number> changes text width of output to <number>\n     (default = 65 characters per line)\n  -? shows this help\n", .{});
}

pub fn fileRead(fname: []const u8) ![]u8 {
	const fileContents = try std.fs.cwd().readFileAlloc(
		ap_allocator,
		fname,
		std.math.maxInt(u32));
	return fileContents;
}

pub fn fileBufWrite(file_path: []const u8, data: []const u8) !void {
	const file = try std.fs.cwd().createFile(file_path, .{});
	defer file.close();
	var bufwriter = std.io.bufferedWriter(file.writer());
	const writer = bufwriter.writer();
	try writer.print("{s}", .{data});
	try bufwriter.flush();
}

pub fn writeByteLnBk(byte: u8) !void {
	try alist.append(byte);
	twct += 1;
	if (twct > tw) {
		if (builtin.target.os.tag == .windows) try alist.append(13);
		try alist.append(10);
		twct = 1;
	}
}

pub fn writeAllLnBk(bytes: []const u8) !void {
	for (bytes) |b| try writeByteLnBk(b);
}

pub fn addNewline() !void {
	if (builtin.target.os.tag == .windows) try alist.append(13);
	try alist.append(10);
}

pub fn putCharNumBr(char_in: u8) !void {
	var buffer: [5]u8 = undefined;
	const charnum: []u8 = try std.fmt.bufPrint(
		&buffer,
		"{{{d:0>3}}}",
		.{char_in});
	try writeAllLnBk(charnum);
}

pub fn put3byterBr(byte1: u16, byte2: u16, byte3: u16) !void {
	var buffer: [13]u8 = undefined;
	const brace3: []u8 = try std.fmt.bufPrint(
		&buffer,
		"[{d:0>3}+{d:0>3}+{d:0>3}]", 
		.{byte1, byte2, byte3});
	try writeAllLnBk(brace3);
}

pub fn putXyFunc(byte_2: u16, byte_3: u16) !void {
	const index: u32 = (256 * (byte_2 % 128) / 2) + (byte_3 / 2);
	var afunc: [5]u8 = undefined;
	const func_no = [_]u8{'[', '2', '5', '5', '+', '1', '2', '9', '+',
		 '1', '6', '3', ']'};
	const func_nos = func_no[0..];
	const xyfuncs = [_]*const [2:0]u8{
"@0", "@1", "@2", "@3", "@4", "@5", "@6", "@7", "@8", "@9", "@A", "@B",
"@C", "@D", "@E", "@F", "@G", "@H", "@I", "@J", "@K", "@L", "@M", "@N",
"@O", "@P", "@Q", "@R", "@S", "@T", "@U", "@V", "@W", "@X", "@Y", "@Z",
"AD", "AS", "BF", "BK", "BS", "CC", "CD", "CH", "CI", "CL", "CM", "CN",
"CP", "CR", "CS", "CU", "DC", "DF", "GH", "DL", "DP", "DS", "DW", "EL",
"ER", "EX", "GT", "HM", "M0", "M1", "M2", "M3", "M4", "M5", "M6", "M7",
"M8", "MD", "MU", "MV", "NC", "NL", "NK", "NP", "NR", "NS", "NT", "NW",
"PC", "PD", "PL", "PP", "PR", "PS", "PT", "PU", "PW", "R0", "R1", "R2",
"R3", "R4", "R5", "R6", "R7", "R8", "R9", "RC", "RD", "RE", "RL", "RP",
"RS", "RV", "RW", "SD", "SH", "SI", "SK", "SM", "SN", "SS", "SU", "SV",
"TF", "TI", "TN", "TS", "UD", "WA", "WC", "WL", "WN", "WS", "WX", "WW",
"XC", "XD", "DT", "S1", "S2", "S3", "S4", "S5", "S6", "S7", "SP", "BC",
"LB", "LE", "NF", "PF", "TP", "BD", "MS", "NM", "LD", "LL", "LR", "LU",
"UP", "FF", "YD", "DO", "DX", "MK", "SO", "OP", "WZ", "NX", "SW", "FD",
"FM", "TL", "TR", "TE", "ED", "EE", "HC", "EC", "MC", "#1", "#2", "#3",
"#4", "#5", "#6", "#7", "#8", "#9", "$1", "$2", "$3", "$4", "$5", "$6",
"$7", "$8", "$9", "DR", "EN", "C0", "C1", "C2", "C3", "C4", "C5", "C6",
"C7", "C8", "C9", "EF", "IB", "NO", "NI", "CO", "$0", "LS", "XP", "WG",
"XM", "&0", "&1", "&2", "&3", "&4", "&5", "&6", "&7", "&8", "&9", "&A",
"&B", "&C", "&D", "&E", "&F", "&G", "&H", "&I", "&J", "&K", "&L", "&M",
"&N", "&O", "&P", "&Q", "&R", "&S", "&T", "&U", "&V", "&W", "&X", "&Y",
"&Z", "HL", "$A", "$B", "$C", "$D", "$E", "$F", "$G", "$H", "$I", "$J",
"$K", "$L", "$M", "$N", "$O", "$P", "$Q", "$R", "$S", "$T", "$U", "$V",
"$W", "$X", "$Y", "$Z", "XX", "H@", "VH", "MW", "QH", "DK", "SR", "SC",
"TG", "H1", "JH", "DZ", "DD", "DM", "LT", "RK", "NN", "MT", "ET", "ZT",
"T1", "TT", "<<", ">>", "IT", "SL", "SF", "FL", "FR", "FC", "SY", "ME",
"AC", "FS", "TW", "MI", "RO", "NB", "Q1", "Q2", "Q3", "Q4", "Q5", "Q6",
"Q7", "Q8", "TO", "IR", "AR", "AX", "DB", "DE", "HF", "SA", "OV", "TC",
"TB", "JM", "SG", "XH", "FT", "BX", "MN", "CB", "M9", "MZ", "ZZ", "RX",
"ST", "KF", "JC", "AK", "TM", "NU", "B4", "QP", "HG", "US", "XE", "ES",
"RB", "S-", "S+", "**", "BN", "RU", "CF", "UI", "XS", "EA", "BT", "KD",
"DN", "HI", "WH", "XN", "FX", "UN", "MX", "AZ", "BR", "HK", "#X", "??",
"BM", "JR", "XO", "XW", "TX", "LF", "LO", "BL", "XT", "WT", "IC", "CT",
"VB", "-D", "WD", "RM", "LM", "aL", "aR", "aB", "aE", "MP", "mN", "QL",
"QR", "MF"};
	if (index < xyfuncs.len) {
		afunc[0] = '[';
		afunc[1] = xyfuncs[index][0];
		afunc[2] = xyfuncs[index][1];
		if (afunc[1] == 'N' and afunc[2] == 'O') {
			try writeAllLnBk(func_nos); // workaround for flaky func NO
			return;
		}
		afunc[3] = '_';
		afunc[4] = ']';
		try writeAllLnBk(afunc[0..]);
	}
}

pub fn putXyWildcard(byte_3: u16) !void {
	var wild: [4]u8 = undefined;
	wild[0] = '[';
	wild[1] = 'w';
	var wild1: [5]u8 = undefined;
	wild1[0] = '[';
	wild1[1] = 'w';
	var wilddot = [_]u8{
		'[', '2', '5', '5', '+', '1', '9', '2', '+', '1', '7', '4', ']'};
	var c: u3 = 2;
	switch (byte_3) {
		46 => {wild[c] = '<'; c += 1;}, 
		47 => {wild[c] = '>'; c += 1;},
		145 => {wild1[c] = '1'; wild1[c + 1] = '3'; c += 2;},
		153 => {wild1[c] = '1'; wild1[c + 1] = '0'; c += 2;},
		155 => {wild[c] = 'C'; c += 1;}, 
		173 => {wild[c] = '-'; c += 1;}, 
		174 => {wild[c] = '.'; c += 1;}, 
		176 => {wild[c] = '0'; c += 1;}, 
		177 => {wild[c] = '1'; c += 1;}, 
		178 => {wild[c] = '2'; c += 1;}, 
		179 => {wild[c] = '3'; c += 1;}, 
		180 => {wild[c] = '4'; c += 1;}, 
		181 => {wild[c] = '5'; c += 1;}, 
		182 => {wild[c] = '6'; c += 1;}, 
		183 => {wild[c] = '7'; c += 1;}, 
		184 => {wild[c] = '8'; c += 1;}, 
		185 => {wild[c] = '9'; c += 1;}, 
		193 => {wild[c] = 'A'; c += 1;}, 
		204 => {wild[c] = 'L'; c += 1;}, 
		206 => {wild[c] = 'N'; c += 1;}, 
		207 => {wild[c] = 'O'; c += 1;}, 
		211 => {wild[c] = 'S'; c += 1;}, 
		215 => {wild[c] = 'W'; c += 1;}, 
		216 => {wild[c] = 'X'; c += 1;},
		else => wild[0] = 0,
	}
	if (wild[2] == '.') {
		try writeAllLnBk(wilddot[0..]);
		return;
	}
	if (c > 3) {
		wild1[c] = ']';
		try writeAllLnBk(wild1[0..]);
	} else {
		wild[c] = ']';
		try writeAllLnBk(wild[0..]);
	}
}

pub fn putXySpeedoChar(byte_2: u16, byte_3: u16) !void {
	const index: u32 = (256 * (1 + byte_3)) + byte_2;
	var buffer: [5]u8 = undefined;
	const speedo: []u8 = try std.fmt.bufPrint(
		&buffer,
		"[{d}]", .{index});
	try writeAllLnBk(speedo);
}

Why do you accumulate all your output in alist and then only write it at the end of the program?

Here is what I would do instead:

  • In a first step get all the data from the command line arguments
  • based on that data either create a buffered writer to stdout or open a write file and create a buffered writer to that (basically have a use_file boolean)
  • use that buffered writer and directly write the output to that, removing alist from the code
  • flush the buffered writer
pub fn main() !void {
    // get data from command line args

    const file = if(use_file) try std.fs.cwd().createFile(file_path, .{}) else undefined;
    defer if(use_file) file.close();
    var buffered = std.io.bufferedWriter(if(use_file) file.writer() else std.io.getStdOut().writer());
    const out = buffered.writer();

    // directly output to out

    try buffered.flush();
}

The documentation of std.io.buffered_writer.bufferedWriter(underlying_stream) shows that the bufferedWriter function returns a value of type BufferedWriter(4096, @TypeOf(underlying_stream)). Here 4096 is the buffer size used.

If you want to specify your own buffer size, you should (currently) be able to do it as follows:

const std = @import("std");

pub fn main() !void {
    // This would not specify buffer size and default to 4096:
    // var stdout_bufwriter = std.io.bufferedWriter(std.io.getStdOut().writer());
    //
    // Instead we do:
    const stdout_unbuf = std.io.getStdOut().writer();
    const bufsize = 8192;
    var stdout_bufwriter = std.io.BufferedWriter(
        bufsize,
        @TypeOf(stdout_unbuf),
    ){
        .unbuffered_writer = stdout_unbuf,
    };

    const stdout = stdout_bufwriter.writer();
    try stdout.print("Hello Buffer!\n", .{});
    try stdout_bufwriter.flush();
}

As I understand it, this interface is going to change soon as I/O is reworked for async/await, so it might work differently in the near future. As I learned here, there is already a demo branch in the repository for the new async/await I/O concept, but I haven’t looked at it yet and am not sure what state it is in.

On my system, 4096 happens to be equivalent to the memory page size, but I think that may be platform dependent.

$ getconf PAGESIZE
4096

Note, however, that BufferedWriter(4096, _) is slightly larger than 4096 bytes, so it won’t fit in one memory page (if memory page size is 4096). On my system:

const std = @import("std");

pub fn main() void {
    std.debug.print("Total size = {}\n", .{
        @sizeOf(std.io.BufferedWriter(4096, std.fs.File)),
    });
}

Total size = 4112

Still, the buffer alone (disregarding the data for the underlying stream) is 4096 bytes and would thus fit into one memory page on many platforms, I assume. Those 4096 bytes would then also be the amount written per system call, I believe.

As far as I understand, the BufferedWriter isn’t necessarily on the heap, but if you simply assign it to a variable in main, it would be on the stack. BufferedWriter does not perform further allocation.

I don’t know. I would assume that a buffer size of 8192, for example, would halve the number of system calls required, at the expense of higher latency and more memory use.

I personally wouldn’t worry much and would use the default values unless you want to transfer or generate really large amounts of data (perhaps several gigabytes in a short time?). But I don’t really know.
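
The assumption about system calls can be checked without touching the OS by counting how often the buffered writer hands data to the stream underneath it. This is only a sketch, assuming the Zig 0.14 std.io API used above (std.io.BufferedWriter, std.io.GenericWriter); CountingSink and countUnderlyingWrites are made-up names for illustration, and each call into the sink merely stands in for one write system call.

```zig
const std = @import("std");

/// Hypothetical stand-in for the OS: it discards the data and only counts
/// how many times the buffered writer calls into it.
const CountingSink = struct {
    calls: usize = 0,

    const Writer = std.io.GenericWriter(*CountingSink, error{}, write);

    fn write(self: *CountingSink, data: []const u8) error{}!usize {
        self.calls += 1;
        return data.len; // pretend everything was written
    }

    fn writer(self: *CountingSink) Writer {
        return .{ .context = self };
    }
};

/// Writes `total` bytes one at a time through a buffer of `bufsize` bytes
/// and returns how many writes reached the underlying sink.
fn countUnderlyingWrites(comptime bufsize: usize, total: usize) !usize {
    var sink = CountingSink{};
    var buffered = std.io.BufferedWriter(bufsize, CountingSink.Writer){
        .unbuffered_writer = sink.writer(),
    };
    const out = buffered.writer();
    var i: usize = 0;
    while (i < total) : (i += 1) try out.writeByte('x');
    try buffered.flush();
    return sink.calls;
}

pub fn main() !void {
    const total = 1 << 20; // 1 MiB
    std.debug.print("4096: {d} writes\n", .{try countUnderlyingWrites(4096, total)});
    std.debug.print("8192: {d} writes\n", .{try countUnderlyingWrites(8192, total)});
}
```

For 1 MiB written byte by byte, this should report 256 underlying writes with the 4096-byte buffer and 128 with the 8192-byte one, which matches the "half as many calls" intuition. It says nothing about actual latency or throughput.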


@Sze: Oh, this is elegant. Thank you! I’ve tried to implement it, here:

// xencode.zig: Human-readable plain-text encoding for binary files
// CLD rev. 2025-07-11_16:20

const std = @import("std");
const builtin = @import("builtin");
var arena = std.heap.ArenaAllocator.init(std.heap.page_allocator);
const ap_allocator = arena.allocator();
var tw: u8 = 65; // output text width
var twct: u8 = 1; // tw counter

pub fn main() !void {
	defer arena.deinit();
	const args = try std.process.argsAlloc(ap_allocator);
	if (args.len < 2) {
		showHelp();
		return;
	}
	var omode: i8 = 0; // output mode: -1 = printable only (-a),
	                   // 0 = suppress XyWrite readability aids (default),
	                   // 1 = apply XyWrite readability aids (-x)
	var i: u8 = 1;
	var byte2: u16 = undefined;
	var byte3: u16 = undefined;
	const tab: []const u8 = "{tab}";
	const crlf0: []const u8 = "[013+010]";
	const crlf: []const u8 = "[cr|lf]";
	const lbrc: []const u8 = "{091}";
	const rbrc: []const u8 = "{093}";
	const lguil: []const u8 = "{<}";
	const rguil: []const u8 = "{>}";
	const spce: []const u8 = "{032}";
	var ctr: u8 = 0;
	var n: u32 = 0;
	var use_file: bool = false;

// Poll command-line options (all -# option args must come first)
	while (i < args.len) {
		if (std.mem.eql(u8, args[i], "-?")) {
			showHelp();
			return;
		}
		if (std.mem.eql(u8, args[i], "/?")) {
			showHelp();
			return;
		}
		if (std.mem.eql(u8, args[i], "--help")) {
			showHelp();
			return;
		}
		if (std.mem.eql(u8, args[i], "-a")) {
			omode = -1;
			i += 1;
			continue;
		}
		if (std.mem.eql(u8, args[i], "/a")) {
			omode = -1;
			i += 1;
			continue;
		}
		if (std.mem.eql(u8, args[i], "-x")) {
			omode = 1;
			i += 1;
			continue;
		}
		if (std.mem.eql(u8, args[i], "/x")) {
			omode = 1;
			i += 1;
			continue;
		}
		if (std.mem.eql(u8, args[i], "-w")) {
			if (i + 1 <= args.len - 1) {
				const twa = try std.fmt.parseInt(u8, args[i + 1], 10);
				tw = twa;
				i += 2;
				continue;
			}
		}
		if (std.mem.eql(u8, args[i], "/w")) {
			if (i + 1 <= args.len - 1) {
				const twa = try std.fmt.parseInt(u8, args[i + 1], 10);
				tw = twa;
				i += 2;
				continue;
			}
		}
		break;
	}
	if (i > args.len - 1) {
		showHelp();
		return;
	}
	const file_in: []u8 = args[i];
	std.fs.cwd().access(file_in, .{}) catch |err| {
		switch (err) {
			error.FileNotFound => {
				std.debug.print("File not found: \"{s}\"\n", .{file_in});
				return;
			},
			else => return err, // e.g. access denied: report it instead of asserting it cannot happen
		}
	};
	var file_ex: []u8 = undefined;
	if (i + 1 == args.len - 1) {
		file_ex = args[i + 1];
		use_file = true;
	}
	const data: []u8 = try fileRead(file_in);
	const file = if(use_file) try std.fs.cwd().createFile(file_ex, .{})
		else undefined;
	defer if(use_file) file.close();
	var buffered = std.io.bufferedWriter(if(use_file) file.writer()
		else std.io.getStdOut().writer());
	const out = buffered.writer();
	try out.writeAll("XPLeNCODE v2.0 (xencode.exe)");
	try addNewline(out);
	try out.writeAll("b-gin [UNTITLED]");
	try addNewline(out);
	while (n < data.len) {
		if (omode < 0) {
			if (data[n] > 32 and data[n] < 127) {
				try writeByteLnBk(data[n], out);
			}
			else if (data[n] == 32) {
				if (twct > 1 and twct < tw - 1) try writeByteLnBk(' ', out)
					else try writeAllLnBk(spce, out);	
			} else {
				try writeByteLnBk('.', out);
			}
			n += 1;
			continue;
		}
		ctr = 0;
		try switch (data[n]) {
			9 => {
				switch (omode) {
					0 => try putCharNumBr(9, out),
					1 => try writeAllLnBk(tab, out),
					else => try writeByteLnBk('.', out),
				}
			},
			13 => {
				if (omode < 0) {
					try writeByteLnBk('.', out);
					n += 1;
					continue;
				}
				if (n + 1 < data.len and data[n + 1] == 10) { // bounds check: CR may be the last byte
					if (omode > 0) {
						try writeAllLnBk(crlf, out);
					} else {
						try writeAllLnBk(crlf0, out);
					}
					n += 1;
				} else {
					try putCharNumBr(data[n], out);
				}
			},
			32 => {
				if (twct > 1 and twct < tw - 1) try writeByteLnBk(' ', out)
					else try writeAllLnBk(spce, out);	
			},
			33...90 => try writeByteLnBk(data[n], out),
			91 => {
				if (omode > 0) try writeAllLnBk(lbrc, out)
					else try writeByteLnBk(data[n], out);
			},
			92 => try writeByteLnBk(data[n], out),
			93 => {
				if (omode > 0) try writeAllLnBk(rbrc, out)
					else try writeByteLnBk(data[n], out);
			},
			94...122 => try writeByteLnBk(data[n], out),
			123 => putCharNumBr(data[n], out),
			124 => try writeByteLnBk(data[n], out),
			125 => putCharNumBr(data[n], out),
			174 => {
				if (omode > 0) try writeAllLnBk(lguil, out)
					else try putCharNumBr(data[n], out);
			},
			175 => {
				if (omode > 0) try writeAllLnBk(rguil, out)
					else try putCharNumBr(data[n], out);
			},
			254 => { // Speedo charset
				if (omode < 1) {
					try putCharNumBr(data[n], out);
					n += 1;
					continue;
				}
				if (data.len - n < 3) {
					try putCharNumBr(data[n], out);
					n += 1;
					continue;
				}
				byte2  = data[n + 1];
				byte3  = data[n + 2];
				if (byte3 < 2 or (byte3 < 3 and byte2 < 232)) {
					try putXySpeedoChar(byte2, byte3, out);
					n += 3;
				} else {
 				// Generic 3-byter [254+nnn+nnn]
					try put3byterBr(254, byte2, byte3, out);
					n += 3;
					continue;
				}
			},
			255 => {
				if (omode < 1) {
					try putCharNumBr(data[n], out);
					n += 1;
					continue;
				}
				if (data.len - n < 3) {
					try putCharNumBr(data[n], out);
					n += 1;
					continue;
				}
				byte2  = data[n + 1];
				byte3  = data[n + 2];
				// 3-byte functions
				if (byte2 > 127 and byte2 < 131 and byte3 % 2 == 1) {
						try putXyFunc(byte2, byte3, out);
						n += 3;
						continue;
				}
				// Search wildcards
				else if ((byte2 == 192 and (byte3 == 145 or byte3 == 153 or byte3 == 155 or byte3 == 173 or byte3 == 174 or (byte3 >= 176 and byte3 <= 185) or byte3 == 193 or byte3 == 204 or byte3 == 206 or byte3 == 207 or byte3 == 211 or byte3 == 215 or byte3 == 216)) or (byte2 == 193 and (byte3 == 46 or byte3 == 47))) {
					try putXyWildcard(byte3, out);
					n += 3;
					continue;
				} else {
 				// Generic 3-byter [255+nnn+nnn]
					try put3byterBr(255, byte2, byte3, out);
					n += 3;
					continue;
				}
			},
			else => try putCharNumBr(data[n], out),
		};
		n += 1;
	}
	if (twct > 1) try addNewline(out);
	try out.writeAll("-nd");
	try addNewline(out);
	try out.writeAll("XPLeNCODE");
	try addNewline(out);
	try buffered.flush();
}

pub fn showHelp() void {
	std.debug.print("Human-readable Plain-text Encoding for Binary Files\n                              [CLD rev. 2025-07-11]\n\nUsage (optional arguments first):\nxencode [-a|-x] [-w <number>] file_in [file_out] | [-?]\n\nIf file_out is omitted, output is directed to stdout.\n\nOptions:\n  -a outputs Ascii 32-126 only (not decodable)\n  -x applies XyWrite readability aids to output\n  -w <number> changes text width of output to <number>\n     (default = 65 characters per line)\n  -? shows this help\n", .{});
}

pub fn fileRead(fname: []const u8) ![]u8 {
	const fileContents = try std.fs.cwd().readFileAlloc(
		ap_allocator,
		fname,
		std.math.maxInt(u32));
	return fileContents;
}

pub fn writeByteLnBk(byte: u8, wr: anytype) !void {
	try wr.writeByte(byte);
	twct += 1;
	if (twct > tw) {
		if (builtin.target.os.tag == .windows) try wr.writeByte(13);
		try wr.writeByte(10);
		twct = 1;
	}
}

pub fn writeAllLnBk(bytes: []const u8, wr: anytype) !void {
	for (bytes) |b| try writeByteLnBk(b, wr);
}

pub fn addNewline(wr: anytype) !void {
	if (builtin.target.os.tag == .windows) try wr.writeByte(13);
	try wr.writeByte(10);
}

pub fn putCharNumBr(char_in: u8, wr: anytype) !void {
	var buffer: [5]u8 = undefined;
	const charnum: []u8 = try std.fmt.bufPrint(
		&buffer,
		"{{{d:0>3}}}",
		.{char_in});
	try writeAllLnBk(charnum, wr);
}

pub fn put3byterBr(byte1: u16, byte2: u16, byte3: u16, wr: anytype) !void {
	var buffer: [13]u8 = undefined;
	const brace3: []u8 = try std.fmt.bufPrint(
		&buffer,
		"[{d:0>3}+{d:0>3}+{d:0>3}]", 
		.{byte1, byte2, byte3});
	try writeAllLnBk(brace3, wr);
}

pub fn putXyFunc(byte_2: u16, byte_3: u16, wr: anytype) !void {
	const index: u32 = (256 * (byte_2 % 128) / 2) + (byte_3 / 2);
	var afunc: [5]u8 = undefined;
	const func_no = [_]u8{'[', '2', '5', '5', '+', '1', '2', '9', '+',
		 '1', '6', '3', ']'};
	const func_nos = func_no[0..];
	const xyfuncs = [_]*const [2:0]u8{
"@0", "@1", "@2", "@3", "@4", "@5", "@6", "@7", "@8", "@9", "@A", "@B",
"@C", "@D", "@E", "@F", "@G", "@H", "@I", "@J", "@K", "@L", "@M", "@N",
"@O", "@P", "@Q", "@R", "@S", "@T", "@U", "@V", "@W", "@X", "@Y", "@Z",
"AD", "AS", "BF", "BK", "BS", "CC", "CD", "CH", "CI", "CL", "CM", "CN",
"CP", "CR", "CS", "CU", "DC", "DF", "GH", "DL", "DP", "DS", "DW", "EL",
"ER", "EX", "GT", "HM", "M0", "M1", "M2", "M3", "M4", "M5", "M6", "M7",
"M8", "MD", "MU", "MV", "NC", "NL", "NK", "NP", "NR", "NS", "NT", "NW",
"PC", "PD", "PL", "PP", "PR", "PS", "PT", "PU", "PW", "R0", "R1", "R2",
"R3", "R4", "R5", "R6", "R7", "R8", "R9", "RC", "RD", "RE", "RL", "RP",
"RS", "RV", "RW", "SD", "SH", "SI", "SK", "SM", "SN", "SS", "SU", "SV",
"TF", "TI", "TN", "TS", "UD", "WA", "WC", "WL", "WN", "WS", "WX", "WW",
"XC", "XD", "DT", "S1", "S2", "S3", "S4", "S5", "S6", "S7", "SP", "BC",
"LB", "LE", "NF", "PF", "TP", "BD", "MS", "NM", "LD", "LL", "LR", "LU",
"UP", "FF", "YD", "DO", "DX", "MK", "SO", "OP", "WZ", "NX", "SW", "FD",
"FM", "TL", "TR", "TE", "ED", "EE", "HC", "EC", "MC", "#1", "#2", "#3",
"#4", "#5", "#6", "#7", "#8", "#9", "$1", "$2", "$3", "$4", "$5", "$6",
"$7", "$8", "$9", "DR", "EN", "C0", "C1", "C2", "C3", "C4", "C5", "C6",
"C7", "C8", "C9", "EF", "IB", "NO", "NI", "CO", "$0", "LS", "XP", "WG",
"XM", "&0", "&1", "&2", "&3", "&4", "&5", "&6", "&7", "&8", "&9", "&A",
"&B", "&C", "&D", "&E", "&F", "&G", "&H", "&I", "&J", "&K", "&L", "&M",
"&N", "&O", "&P", "&Q", "&R", "&S", "&T", "&U", "&V", "&W", "&X", "&Y",
"&Z", "HL", "$A", "$B", "$C", "$D", "$E", "$F", "$G", "$H", "$I", "$J",
"$K", "$L", "$M", "$N", "$O", "$P", "$Q", "$R", "$S", "$T", "$U", "$V",
"$W", "$X", "$Y", "$Z", "XX", "H@", "VH", "MW", "QH", "DK", "SR", "SC",
"TG", "H1", "JH", "DZ", "DD", "DM", "LT", "RK", "NN", "MT", "ET", "ZT",
"T1", "TT", "<<", ">>", "IT", "SL", "SF", "FL", "FR", "FC", "SY", "ME",
"AC", "FS", "TW", "MI", "RO", "NB", "Q1", "Q2", "Q3", "Q4", "Q5", "Q6",
"Q7", "Q8", "TO", "IR", "AR", "AX", "DB", "DE", "HF", "SA", "OV", "TC",
"TB", "JM", "SG", "XH", "FT", "BX", "MN", "CB", "M9", "MZ", "ZZ", "RX",
"ST", "KF", "JC", "AK", "TM", "NU", "B4", "QP", "HG", "US", "XE", "ES",
"RB", "S-", "S+", "**", "BN", "RU", "CF", "UI", "XS", "EA", "BT", "KD",
"DN", "HI", "WH", "XN", "FX", "UN", "MX", "AZ", "BR", "HK", "#X", "??",
"BM", "JR", "XO", "XW", "TX", "LF", "LO", "BL", "XT", "WT", "IC", "CT",
"VB", "-D", "WD", "RM", "LM", "aL", "aR", "aB", "aE", "MP", "mN", "QL",
"QR", "MF"};
	if (index < xyfuncs.len) {
		afunc[0] = '[';
		afunc[1] = xyfuncs[index][0];
		afunc[2] = xyfuncs[index][1];
		if (afunc[1] == 'N' and afunc[2] == 'O') {
			try writeAllLnBk(func_nos, wr); // workaround for flaky func NO
			return;
		}
		afunc[3] = '_';
		afunc[4] = ']';
		try writeAllLnBk(afunc[0..], wr);
	}
}

pub fn putXyWildcard(byte_3: u16, wr: anytype) !void {
	var wild: [4]u8 = undefined;
	wild[0] = '[';
	wild[1] = 'w';
	var wild1: [5]u8 = undefined;
	wild1[0] = '[';
	wild1[1] = 'w';
	var wilddot = [_]u8{
		'[', '2', '5', '5', '+', '1', '9', '2', '+', '1', '7', '4', ']'};
	var c: u3 = 2;
	switch (byte_3) {
		46 => {wild[c] = '<'; c += 1;}, 
		47 => {wild[c] = '>'; c += 1;},
		145 => {wild1[c] = '1'; wild1[c + 1] = '3'; c += 2;},
		153 => {wild1[c] = '1'; wild1[c + 1] = '0'; c += 2;},
		155 => {wild[c] = 'C'; c += 1;}, 
		173 => {wild[c] = '-'; c += 1;}, 
		174 => {wild[c] = '.'; c += 1;}, 
		176 => {wild[c] = '0'; c += 1;}, 
		177 => {wild[c] = '1'; c += 1;}, 
		178 => {wild[c] = '2'; c += 1;}, 
		179 => {wild[c] = '3'; c += 1;}, 
		180 => {wild[c] = '4'; c += 1;}, 
		181 => {wild[c] = '5'; c += 1;}, 
		182 => {wild[c] = '6'; c += 1;}, 
		183 => {wild[c] = '7'; c += 1;}, 
		184 => {wild[c] = '8'; c += 1;}, 
		185 => {wild[c] = '9'; c += 1;}, 
		193 => {wild[c] = 'A'; c += 1;}, 
		204 => {wild[c] = 'L'; c += 1;}, 
		206 => {wild[c] = 'N'; c += 1;}, 
		207 => {wild[c] = 'O'; c += 1;}, 
		211 => {wild[c] = 'S'; c += 1;}, 
		215 => {wild[c] = 'W'; c += 1;}, 
		216 => {wild[c] = 'X'; c += 1;},
		else => wild[0] = 0,
	}
	if (wild[2] == '.') {
		try writeAllLnBk(wilddot[0..], wr);
		return;
	}
	if (c > 3) {
		wild1[c] = ']';
		try writeAllLnBk(wild1[0..], wr);
	} else {
		wild[c] = ']';
		try writeAllLnBk(wild[0..], wr);
	}
}

pub fn putXySpeedoChar(byte_2: u16, byte_3: u16, wr: anytype) !void {
	const index: u32 = (256 * (1 + byte_3)) + byte_2;
	var buffer: [5]u8 = undefined;
	const speedo: []u8 = try std.fmt.bufPrint(
		&buffer,
		"[{d}]", .{index});
	try writeAllLnBk(speedo, wr);
}


@jbe: I appreciate the detailed explanation. As far as I can tell, a 4096-byte buffer is fine for this purpose. For what it’s worth, in my testing, output files of several hundred MB are written instantaneously. Leaving well enough alone seems to be the way to go.

Question: Am I okay with the global variables I have, or should I try to push some down into main()?

It’s definitely not ‘good practice’ to have unnecessary global variables, but for a little self-contained tool like this, it’s up to you to decide how much that matters.

However, if your goal for this project is to get better at writing Zig rather than just to write a working tool, I would suggest the following exercise:

  • With the help of std.io.GenericWriter, make a generic maxColumnWriter function that gives you a struct which provides a writer that automatically inserts line breaks when needed:
pub fn maxColumnWriter(writer: anytype, max_columns: u8) MaxColumnWriter(@TypeOf(writer)) {
    ...
}

pub fn MaxColumnWriter(comptime Writer: type) type {
    ...
}

// Usage:
//
// const original_writer = ...
//
// var mcw = maxColumnWriter(original_writer, 123);
//
// const max_column_writer = mcw.writer();
  • Encapsulate all configuration options in a Config struct:
pub const Config = struct {
    ...

    pub fn initFromArgs(args: []const []const u8) !Config {
        ...
    }
};
  • Make omode an enum instead of an i8.
  • Consolidate switch branches that execute the same logic. You can do this by including multiple values and ranges in one branch, like so:
33...90, 92, 94...122, 124 => try writeByteLnBk(data[n], out), 
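
One possible shape for the maxColumnWriter exercise, offered as a rough sketch rather than the intended solution. It assumes the Zig 0.14 std.io API (std.io.GenericWriter) and an inner writer that exposes an Error declaration and writeByte, as the std writers in this thread do. The field names inner and column are made up, and unlike writeByteLnBk it always emits a bare '\n' rather than CR LF on Windows.

```zig
const std = @import("std");

pub fn MaxColumnWriter(comptime Inner: type) type {
    return struct {
        inner: Inner,
        max_columns: u8,
        column: u8 = 0, // bytes written on the current line

        const Self = @This();
        pub const Error = Inner.Error;
        pub const Writer = std.io.GenericWriter(*Self, Error, write);

        /// Forwards every byte and inserts a line break once a line
        /// has reached `max_columns` bytes.
        pub fn write(self: *Self, bytes: []const u8) Error!usize {
            for (bytes) |b| {
                try self.inner.writeByte(b);
                self.column += 1;
                if (self.column >= self.max_columns) {
                    try self.inner.writeByte('\n');
                    self.column = 0;
                }
            }
            return bytes.len;
        }

        pub fn writer(self: *Self) Writer {
            return .{ .context = self };
        }
    };
}

pub fn maxColumnWriter(writer: anytype, max_columns: u8) MaxColumnWriter(@TypeOf(writer)) {
    return .{ .inner = writer, .max_columns = max_columns };
}

pub fn main() !void {
    // Write into a fixed buffer so the result can be inspected.
    var buf: [64]u8 = undefined;
    var fbs = std.io.fixedBufferStream(&buf);
    var mcw = maxColumnWriter(fbs.writer(), 4);
    const out = mcw.writer();
    try out.writeAll("abcdefghij");
    std.debug.print("{s}\n", .{fbs.getWritten()}); // abcd, efgh, ij on separate lines
}
```

Because the column counter lives in the struct, the globals tw and twct would no longer be needed, and helpers such as putCharNumBr could call out.print directly instead of going through bufPrint and writeAllLnBk.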
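
The Config and enum suggestions could look roughly like the following. Treat it as a sketch under assumptions: the names OutputMode, text_width, file_in, file_out and leftBracket are invented here, only the dash forms of the options are handled (no /a, /x, /w or help flags), and a -w without a number is reported as an error rather than being taken as a file name. The slice returned by std.process.argsAlloc has a different element type, so it would probably need adapting before being passed in.

```zig
const std = @import("std");

/// Replaces the i8 omode values -1, 0 and 1.
pub const OutputMode = enum { printable_only, plain, xywrite_aids };

pub const Config = struct {
    mode: OutputMode = .plain,
    text_width: u8 = 65,
    file_in: []const u8,
    file_out: ?[]const u8 = null, // null means stdout

    pub fn initFromArgs(args: []const []const u8) !Config {
        var mode: OutputMode = .plain;
        var text_width: u8 = 65;
        var i: usize = 1; // args[0] is the program name
        while (i < args.len) : (i += 1) {
            const arg = args[i];
            if (std.mem.eql(u8, arg, "-a")) {
                mode = .printable_only;
            } else if (std.mem.eql(u8, arg, "-x")) {
                mode = .xywrite_aids;
            } else if (std.mem.eql(u8, arg, "-w")) {
                i += 1; // the width is the next argument
                if (i >= args.len) return error.MissingWidth;
                text_width = try std.fmt.parseInt(u8, args[i], 10);
            } else break; // first non-option argument is the input file
        }
        if (i >= args.len) return error.MissingInputFile;
        return .{
            .mode = mode,
            .text_width = text_width,
            .file_in = args[i],
            .file_out = if (i + 1 < args.len) args[i + 1] else null,
        };
    }
};

/// With an enum, the switch must cover every mode, so a forgotten case
/// becomes a compile error instead of a silent fall-through.
fn leftBracket(mode: OutputMode) []const u8 {
    return switch (mode) {
        .xywrite_aids => "{091}",
        .printable_only, .plain => "[",
    };
}

pub fn main() !void {
    const args = [_][]const u8{ "xencode", "-x", "-w", "72", "in.bin" };
    const cfg = try Config.initFromArgs(&args);
    std.debug.print("mode={s} width={d} in={s} bracket={s}\n", .{
        @tagName(cfg.mode), cfg.text_width, cfg.file_in, leftBracket(cfg.mode),
    });
}
```

main would then shrink to building a Config, opening the output, and running the encoding loop with cfg.mode in place of the omode comparisons.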