Uh, how does fscanf() work?

This question is completely unrelated to Zig. Currently, I’m trying to get my WASI handlers (implemented in JavaScript) to work with the WebAssembly libc bundled with the Zig compiler. I’m having trouble getting fscanf() and scanf() to work. I program in C on and off, but I can’t recall using these ancient functions in the last three decades, and I don’t understand exactly what they do. Right now I notice that a call to fscanf() triggers a call to fd_read() for 1024 bytes. My handler returns 3 lines separated by \n, which is the full content of the “file”. But only the first line gets read by fscanf(). Here’s the test code:

    int a, b, c;
    char buffer[128];
    int count;
    do {
        count = fscanf(file, "%d %d %d %s", &a, &b, &c, buffer);
        printf("%d %d %d %s", a, b, c, buffer);
    } while (count > 0);

Sorry for the off-topic post but I’m really stumped on this.

first a reminder about the scanf family of functions

Since you rarely use these functions and I currently have to know them for my job, here is a reminder of how they work. fscanf reads from the FILE* given as its first parameter and tries to match the input against the format string.

In your case, that’s an integer, followed by an arbitrary amount of whitespace (yes, it can be multiple spaces and even tabs; anything accepted by isspace, to be exact), followed by another integer, more whitespace, a third integer, more whitespace, and finally an arbitrarily long string that ends at the next whitespace character.
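To make the matching concrete, here is a minimal sketch of a read loop. The helper name count_records is hypothetical, and the %127s width is my addition to keep the 128-byte array safe. It relies on fscanf returning the number of fields it converted (or EOF), so it stops as soon as a full record of 4 fields can no longer be matched, rather than looping while the count is above 0.

```c
#include <stdio.h>

/* Hypothetical helper: counts the "int int int word" records in `file`. */
int count_records(FILE *file)
{
    int a, b, c;
    char word[128];
    int records = 0;

    /* fscanf returns the number of fields it converted, or EOF at end of
     * input. Only a return value of 4 means a, b, c and word are all valid. */
    while (fscanf(file, "%d %d %d %127s", &a, &b, &c, word) == 4) {
        printf("%d %d %d %s\n", a, b, c, word);
        records++;
    }
    return records;
}
```

Because the values are printed only after a successful match, the last iteration never prints stale contents of a, b, c and word.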

This also means that you can get a buffer overflow here. If you really want to use one of the scanf functions for some reason (I would recommend not), put a maximum field width in the format string, e.g. %127s (don’t forget the null byte; it’s not included in that number). For example, with your format string, 3 numbers followed by a word of 128 characters would overflow the buffer.
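A small sketch of the bounded variant, using sscanf on a string so it can be tried without a file. The helper name parse_record and its signature are made up for illustration; only the %127s width is the point here.

```c
#include <stdio.h>

/* Hypothetical helper: reads "int int int word" from `line` into the
 * outputs. Returns 1 if all four fields were converted, 0 otherwise. */
int parse_record(const char *line, int *a, int *b, int *c, char word[128])
{
    /* %127s stores at most 127 characters, then the terminating null byte,
     * so a 128-byte buffer cannot overflow. */
    return sscanf(line, "%d %d %d %127s", a, b, c, word) == 4;
}
```

A longer word is cut off after 127 characters. With fscanf, the rest of that word stays in the input and is seen by the next conversion, so the width prevents the overflow but does not skip the oversized token.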

On a side note: according to the man page on my system, %d skips leading whitespace implicitly (as do most conversions; %c, %[ and %n are the exceptions). So "%d %d", "%d%d" and "%d\t%d" are identical in meaning as far as the scanf family of functions is concerned. The same goes for the input string: any amount of whitespace between the numbers is accepted, as this example shows:

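    #include <stdio.h>

    /* Parse the same input with the given format and print both values. */
    static void make(const char *format) {
        const char *buf = "5     \t  5";
        int a = 0, b = 0;
        sscanf(buf, format, &a, &b);
        printf("a = %d\nb = %d\n", a, b);
    }

    int main(void) {
        /* All three formats print a = 5 and b = 5. */
        make("%d%d");
        make("%d %d");
        make("%d   \t%d");
        return 0;
    }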

your program in specific

I put your code into Compiler Explorer.

On normal Linux it seems to work as expected with GCC.

When building it with Zig CC, there are a few warnings about the options, which seem to be a Compiler Explorer thing, but it works on normal Linux too.

It doesn’t compile to WebAssembly there, because Zig tries to compile the libc and Compiler Explorer’s fd quota is too strict for that. Maybe the Zig compiler should learn how to handle such a restriction gracefully instead of just refusing to work, but that’s a different topic.

Compiling it on my machine with zig cc -target wasm32-wasi -o wasmtest.wasm wasmtest.c -g with Zig 0.15.1 worked, so I ran it with wasmtime run --dir . wasmtest.wasm and it produced the output I expected.

So, I can’t reproduce it. Can you provide a full example which reproduces your problem, please?

Okay, I’m an idiot. I forgot to call fflush() on stdout :zany_face:. Because my test code involves a C function called from JavaScript, there isn’t a program termination event that would trigger autoflush. Output from printf() ends up left behind in the buffer when the function returns.

I’m just going to call setbuf(stdout, NULL) on initialization to disable buffering.
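For anyone who hits the same thing, a minimal sketch of both options, assuming a C function that is called from the host and never reaches normal program exit. The names init_unbuffered_stdout and print_record are hypothetical. setvbuf with _IONBF has the same effect as setbuf(stdout, NULL); it is used here only because it reports success or failure.

```c
#include <stdio.h>

/* Hypothetical init hook: call once, before anything is written to stdout.
 * Same effect as setbuf(stdout, NULL). Returns 0 on success. */
int init_unbuffered_stdout(void)
{
    return setvbuf(stdout, NULL, _IONBF, 0);
}

/* Hypothetical function exported to the host. It flushes explicitly, so the
 * output is written out even if stdout is still buffered. Returns 0 on
 * success, EOF on a write error. */
int print_record(int a, int b, int c, const char *word)
{
    printf("%d %d %d %s\n", a, b, c, word);
    return fflush(stdout);
}
```

Either measure alone is enough: disabling buffering costs a write call per printf, while the explicit fflush keeps buffering but has to be remembered in every exported function.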