`std.fmt.format` in aarch64 freestanding

I am following Raspberry Pi 3 bare-metal kernel guides and translating them to Zig to learn as I go.

However, I have run into an issue where my kernel seems to crash when using std.fmt.format().

I’ve implemented Writer like so:

const TerminalWriter = struct {
    const Self = @This();
    pub const Error = error{};

    pub fn write(_: Self, bytes: []const u8) !usize {
        writeString(bytes);
        return bytes.len;
    }

    pub fn writeByte(self: Self, byte: u8) !void {
        _ = try self.write(&.{byte});
    }

    pub fn writeBytesNTimes(self: Self, bytes: []const u8, n: usize) !void {
        for (0..n) |_| {
            _ = try self.write(bytes);
        }
    }

    pub fn writeAll(_: Self, bytes: []const u8) !void {
        writeString(bytes);
    }
};

which is called here:

pub fn print(comptime format: []const u8, args: anytype) void {
    std.fmt.format(TerminalWriter{}, format, args) catch unreachable;
}

It results in the following output:

Synchronous: Unknown:
  ESR_EL1 000000001FE00000 ELR_EL1 00000000000845E0
 SPSR_EL1 00000000600003C4 FAR_EL1 0000000000000000

Interestingly, if I change the writer functions to all call writeString() directly, it works, but fails again as soon as I add any args.

Any ideas - am I missing anything? I’ve looked through the code for format() and can’t see anything obvious…

For reference, my target is:

const target = b.standardTargetOptions(.{ .default_target = .{
        .abi = .eabihf,
        .os_tag = .freestanding,
        .cpu_arch = .aarch64,
        .cpu_model = .{ .explicit = &std.Target.arm.cpu.cortex_a53 },
    } });

I don’t see a problem in your current code, have you shown how you try to use arguments? (you didn’t show TerminalWriter’s print method)

I think you may create an infinite loop of your TerminalWriter using formatting which then again uses the TerminalWriter which tries to use formatting…

When you use formatting within the writer, you need to make sure the problem gets reduced to something simpler instead of just recursing into the same thing and looping endlessly until the stack overflows…

But I am not entirely sure that is what happens with your code. It seems to me the problem isn’t actually in the code you are showing; either that, or I am just not noticing it.

Basically the print method of your TerminalWriter can’t just loop back to using std.fmt.format with the arguments again, instead maybe something like this would work:

pub fn print(_: *Self, comptime format: []const u8, args: anytype) !void {
    var buffer: [4096]u8 = undefined; // some kind of temporary buffer to do the conversion to string, without recursing back to the implementation we currently write
    var fba = std.heap.FixedBufferAllocator.init(&buffer);
    const allocator = fba.allocator();

    const str = try std.fmt.allocPrint(allocator, format, args);
    writeString(str);
}
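A slightly smaller variant of the same idea, as a sketch only: std.fmt.bufPrint formats straight into a caller-provided buffer, so no allocator is needed. The helper name formatToBuffer is hypothetical and not from the original code; in the kernel, the returned slice would be handed to writeString().

```zig
const std = @import("std");

// Hypothetical helper (not part of the original kernel): formats into
// caller-provided storage and returns the slice that was written.
// Fails with error.NoSpaceLeft if the buffer is too small.
fn formatToBuffer(buf: []u8, comptime format: []const u8, args: anytype) ![]const u8 {
    return try std.fmt.bufPrint(buf, format, args);
}
```

Because the formatting goes through a plain byte buffer, it cannot recurse back into the terminal writer.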

But it also might be related to freestanding directly and I am not very familiar with that.


What is writeString()? And does it perhaps expect a null terminated C string?

Sorry, probably should have posted my writeString() code too:

fn writeString(bytes: []const u8) void {
    for (bytes) |byte| {
        writeChar(byte);
    }
}

fn writeChar(c: u8) void {
    // framebuffer shifting removed for brevity
    const x: usize = (cursor % screen_width) * fonts.FONT_WIDTH;
    const y: usize = (cursor / screen_width) * fonts.FONT_HEIGHT;
    fb.drawGlyph(c, @as(u32, @intCast(x)), @as(u32, @intCast(y)), current_foreground, current_background);
    cursor += 1;
}

When I call print():

    print("Hello, world!\n", .{});
    print("Hello, {s}", .{"world!"});

The first line works and is printed; the second line prints up to the space and then crashes…

Use std.io.GenericWriter to implement the Writer. (Call any() on a value of the type it returns to get the type-erased std.io.AnyWriter interface.)

const Context = void;
const WriteError = anyerror; // or error{}
fn writeFn(_: Context, bytes: []const u8) WriteError!usize {
    writeString(bytes);
    return bytes.len;
}
const TerminalWriter = std.io.GenericWriter(Context, WriteError, writeFn);
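
To show how such a writer might be used, here is a minimal sketch assuming the Zig 0.13 standard library. The sink buffer is a stand-in for the framebuffer so the example can run on a host; it is an assumption of this sketch, not part of the kernel.

```zig
const std = @import("std");

// Stand-in for the framebuffer output (assumption made for this sketch only).
var sink: [64]u8 = undefined;
var sink_len: usize = 0;

fn writeString(bytes: []const u8) void {
    for (bytes) |byte| {
        // Drop bytes silently once the stand-in buffer is full.
        if (sink_len < sink.len) {
            sink[sink_len] = byte;
            sink_len += 1;
        }
    }
}

fn writeFn(_: void, bytes: []const u8) anyerror!usize {
    writeString(bytes);
    return bytes.len;
}

const TerminalWriter = std.io.GenericWriter(void, anyerror, writeFn);

pub fn print(comptime format: []const u8, args: anytype) void {
    // The generated type stores its context in a field named `context`.
    const writer = TerminalWriter{ .context = {} };
    writer.print(format, args) catch unreachable;
}
```

With this shape, writeByte, writeAll and writeBytesNTimes all come from GenericWriter and funnel through writeFn, so there is only one function to get right.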

Oh very nice, thank you!
Unfortunately, it doesn’t change the result…

I am really not experienced with freestanding, but maybe you can use qemu to get a debug setup where you can see where it crashes / get something like a debug breakpoint / stack trace?


If it helps in debugging, I have some minimal freestanding apps which write to a terminal via qemu.

zig build
qemu-system-riscv32 -machine virt -nographic -bios samples/shell/zig-out/bin/shell.bin

(ctrl-a, ctrl-x quits qemu)

Here’s the relevant Writer.


This was going to be another question, I suppose…
I can’t seem to get GDB going with my micro kernel.

Which is rather frustrating as I was hoping to come with more information…

I get a:

warning: No executable has been specified and target does not support
determining executable automatically.  Try using the "file" command.

And if I try using the file command:

"kernel.img": not in executable format: file format not recognized

Full code available here:

(apologies if it looks like garbage, I haven’t really touched Zig since 0.6 which is partly why I wanted to pick it back up…)

I also ran into another issue with one of the initial framebuffer guides, which made me think the \\ multiline string syntax does something weird: I was getting a broken image until I switched to a cImport with the data in a C header file… But that’s a whole other topic, I guess…


If it helps, I do seem to be seeing a different exception now:

Synchronous: Data abort, same EL, Address size fault at level 1:
  ESR_EL1 0000000096000021 ELR_EL1 0000000000082698
 SPSR_EL1 00000000600003C4 FAR_EL1 0000000000087FEF

Which is certainly more helpful than nothing/unknown. But still not entirely helpful for me…
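
For what it’s worth, both ESR_EL1 values in this thread can be decoded by hand from the Armv8-A field layout (EC in bits 31:26, IL in bit 25, ISS in bits 24:0). The sketch below does that; the names Esr, decodeEsr, ecName and dfscName are hypothetical, and the tables are deliberately incomplete.

```zig
const std = @import("std");

// ESR_EL1 fields per the Armv8-A layout.
const Esr = struct {
    ec: u6, // exception class, bits [31:26]
    il: u1, // instruction length, bit 25
    iss: u25, // instruction specific syndrome, bits [24:0]
};

fn decodeEsr(esr: u64) Esr {
    return .{
        .ec = @truncate(esr >> 26),
        .il = @truncate(esr >> 25),
        .iss = @truncate(esr),
    };
}

// Partial table of exception classes; anything else needs the Arm ARM.
fn ecName(ec: u6) []const u8 {
    return switch (ec) {
        0x00 => "Unknown reason",
        0x07 => "Trapped SIMD/FP access",
        0x21 => "Instruction abort, same EL",
        0x25 => "Data abort, same EL",
        else => "other",
    };
}

// Partial table of data fault status codes (ISS bits [5:0] for data aborts).
fn dfscName(dfsc: u6) []const u8 {
    return switch (dfsc) {
        0x00...0x03 => "Address size fault",
        0x04...0x07 => "Translation fault",
        0x21 => "Alignment fault",
        else => "other",
    };
}
```

Under this decoding, the earlier value 0x1FE00000 has EC 0x07 (a trapped SIMD/floating-point access), which a handler would report as unknown if its table lacks that class. The value 0x96000021 has EC 0x25 with a fault status code of 0x21, which the architecture defines as an alignment fault rather than an address size fault; a handler that looks only at the low bits of the status code could mislabel it. FAR_EL1 0x87FEF is not 4-byte aligned, which would be consistent with that reading. One possible explanation, not confirmed here, is that with the MMU disabled data accesses are treated as Device memory, where unaligned accesses fault.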

The other thing, I found an x86 kernel that is using the same/similar code which seems to work which I have been able to run locally just fine. Could I be facing something aarch64/freestanding specific or am I missing something obvious?

Have a look at these:

Maybe something like that would work to get debugging to work?


So I’ve got gdb up and running, but it can’t seem to break on any exceptions, which is a bit frustrating.
I was however able to take a look at the address in ELR_EL1 which points to:

(gdb) x 0x82698
0x82698 <fmt.formatType__anon_2084+104>:        0xb9400109
(gdb) info symbol 0x82698
fmt.formatType__anon_2084 + 104 in section .text
(gdb) info line *0x82698
Line 656 of "/usr/local/Cellar/zig/0.13.0/lib/zig/std/fmt.zig" starts at address 0x82698 <fmt.formatType__anon_2084+104> and ends at 0x826c8 <fmt.formatType__anon_2084+152>.

which is referring to here:

if (actual_fmt[0] == 's' and info.child == u8) {
        // This line below:
        return formatBuf(&value, options, writer);
}

If I put a break and try to step through, it just goes straight to my exception handlers:

Thread 1 hit Breakpoint 1, fmt.formatType__anon_2084 (value=..., options=..., max_depth=3) at /usr/local/Cellar/zig/0.13.0/lib/zig/std/fmt.zig:656
656                     return formatBuf(&value, options, writer);
(gdb) step
0x0000000000080800 in _vectors ()
(gdb) step
Single stepping until exit from function _vectors,
which has no line number information.
main.exc_handler (ex_type=557202, esr=6, elr=557202, spsr=6, far=32) at main.zig:161
161     export fn exc_handler(ex_type: u64, esr: u64, elr: u64, spsr: u64, far: u64) void {
(gdb)

It appears to be a problem with accessing value which comes from @field(args, fields_info[arg_to_print].name).

It seems to crash any time value is accessed. Even assigning it to a discard results in a crash with the same data abort (address size fault)… But only from within the switch :thinking:


I believe I’ve figured out what the cause is. It looks like the problem occurs when CurrentEL is set to exception level 1…
The code has no issues at EL3… Would this be a bug somewhere?

It also appears to happen when the build’s .optimize is set to .Debug.