Cooperative multitasking in WASM

For context, I am writing a keyboard-based, cross-platform calorie tracker in Zig using sokol (for transparency: the goal is to sell it commercially). I want the project to compile to WASM (I am using Zig 0.15.2). I tried to make sure I am using the correct terms here, but WASM and threading are very much not my area of expertise (English too, for that matter); I hope my description is clear enough. (Feel free to tell me to rename this to something better.)

I want threads A and B to be able to yield to each other like stackful coroutines. So the pattern is: thread A yields to thread B, thread B runs, then yields back to thread A, and they essentially ping-pong off each other. (Sokol runs on the main thread, and my logic runs on a second thread.) On desktop it was very easy to do this by running two threads and having semaphores tell each other when to switch turns, but that didn't work for WASM (it's not exactly clear to me what the rules are for this in WASM, or what the language currently can/can't do with the IO changes).

I failed to find out how to do this. The worst-case scenario would be having to refactor all that logic into a finite state machine, but I would rather avoid that if there's an alternative.

WASM by itself has no concept of threading. When you use threads in Zig and compile to WASM, you are either using WASI threads or Emscripten threads (pthreads). The actual threading is done by the runtime that runs your program. Emscripten (and probably Wasmer's WASI?) uses Web Workers for this, instantiating the same WASM program on a web worker and running the entry point there.

Wasmtime w/ wasi uses OS threads.

I’m not sure what the limitations of Web Workers are, but I feel like you can’t ever block the main thread in JS. Alternatively, you could probably implement coroutines using either JSPI or Emscripten’s Asyncify (which also uses JSPI if available; see the “Asynchronous Code” page in the Emscripten documentation) without having to run any Web Workers at all.


WASM can work with this model, depending on how your code does the yielding.
Without WASI, you won’t have threading in the traditional sense. You will have a module that exposes some functions that you can call from the host. The module can only run when the host calls into one of those functions.

The easiest way would be to expose two functions, one for sokol/rendering, one for business logic. The host would run the functions one after the other in a loop. This assumes that the functions terminate after they have done their work. Since you already have semaphore stop conditions, you could use those as your breaking points.

Another option would be to have one exposed function (e.g. tick or poll) that the host calls repeatedly. Whose turn it is can be maintained inside the module, and on each call to the function you check whose turn it is.

These two options keep it simple (only one module, only one instance), but rely on the host to do all the driving.


What I understood is something like

while (true) {
   threadAEvent();
   threadBEvent();
}

I suppose the problem I am trying to solve is that the thread for my logic needs to “store” a lot of information on the stack between calls, whereas this approach seems to start from a “blank slate” every time.

Since you are using Emscripten, I noticed Emscripten also has fibers: https://emscripten.org/docs/api_reference/fiber.h.html


Thanks, you have given me a lot of reading to do; this is very unfamiliar territory for me.

Contrary to popular belief, WASM supports pthreads-style shared-memory-threading and Emscripten provides a POSIX-threads compatible C-API (I don’t know though how that’s handled in the Zig stdlib).

It comes with one massive caveat though: the web server must be configured to add ‘COOP/COEP’ headers to HTTP responses for security reasons (e.g. “cross-origin-isolation” - basically telling the browser that the page needs to run isolated in its own process).

E.g. check the “Security requirements” section here: SharedArrayBuffer - JavaScript | MDN

Also see the Emscripten docs on pthreads support:

I never tinkered with WASM pthreads though because the COOP/COEP requirement locks out most hosting providers (like GitHub Pages or, AFAIK, also itch.io).

Tbh, unless you absolutely need multithreading for performance, I would recommend sticking to the main thread. If things like loading assets need to happen in the ‘background’ while rendering continues, jump out into JavaScript functions and use Promise.then().catch(), where the then() and catch() code calls ‘completion callbacks’ on the WASM side. Traditional completion callbacks don’t require dealing with the ASYNCIFY magic code transform, and since you are rendering frames anyway, polling each frame for completion of background tasks isn’t a big deal.

If you just need to load assets in the background you can also check out sokol_fetch.h, this basically shields you from having to write Javascript :wink: There’s now also a Zig-binding for it (sokol-zig/src/sokol/fetch.zig at master · floooh/sokol-zig · GitHub) but I haven’t actually tested it on Zig (e.g. there’s no Zig example for sokol-fetch).

The sokol-samples repository has a couple of C examples for loading things in the background via sokol_fetch.h, e.g.:

(click the ‘src’ link at the bottom to see the C source code)

There is also a ‘local workaround’ for the COOP/COEP issue via ‘service workers’, but that isn’t reliable unfortunately:


If the browser supports JSPI (JavaScript Promise Integration), ASYNCIFY is not that bad. In the web frontend for the platform I’m working on I actually require JSPI for functions like preadv and so on.
https://cloudef.pw/sorvi/#supertux.sorvi (example here, requires javascript.options.wasm_js_promise_integration on firefox, this does not use emscripten at all and it can load any compatible wasm binary)

I’m just slightly annoyed that such a basic feature as JSPI (suspending and resuming WASM execution) is such a recent addition to browsers.


First of all thanks for writing Sokol, it’s the only library I tried that just immediately worked for all platforms and that got me sold. It’s also somewhat surreal to be in a community where essentially the authors of my entire tech stack are present in the same forum.

The problem isn’t performance at all actually, everything in the codebase runs pretty much instantly, it’s the way that the logic needs to be written. That’s what made me use threads.

From what I read, you can use multiple threads in WASM as long as the main thread doesn’t block, but in my initial setup the problem is that while the “business” thread is running, the sokol thread is waiting.

The app is similar to a terminal in the sense that there is a text buffer: the sokol thread writes to that buffer and the “business thread” just reads it, as if reading text from stdin; that is the illusion I was trying to create with the API. This is essentially what a function from the “business thread” looks like:

fn foo() void {
    const s = UI.readNext();
    //... do stuff1
    const s2 = UI.readNext();
    //... do stuff2
    const s3 = UI.readNext();
    //... do stuff3 etc.
}

Whereas if I was calling this step by step from the main thread, I would have to store the variables for “do stuff” somewhere, as well as which code should be executed next (immediately making me think of a finite state machine).


Is the problem here that you don’t want UI.readNext to block? Ah, never mind, you want the opposite.


Yeah, essentially Sokol hijacks the main thread, and from what I read the main thread is not allowed to block (correct me if I am wrong here). So while the “business” thread is actually getting the string, the sokol thread is blocked by the semaphore setup. The sokol thread reads from memory the “business” thread modifies, so I am forced to block it.

I also failed to compile multithreaded code for WASM in the first place, but I think that’s a completely separate issue. (I didn’t try all the tricks / compiler flags people have suggested so far.)

I feel like you can do this without threads altogether. Considering your app probably renders every frame anyway, you can do your own polling / scheduling based on that. Of course, if you don’t want to wastefully render at a fixed rate, then unfortunately you need to interact with the browser somehow (write platform-specific code).

I’m not sure if sokol has abstractions for this but I’m sure @floooh knows :wink:

Yeah, there’s 100% no need for concurrency here; if I could have thread A say “I am done, switch stack frames to thread B” and thread B do the same, the problem would be solved. Thing is, I don’t know how to do it (or whether Zig 0.15.2 supports it; I think it doesn’t, with the IO stuff coming).
The reason I used two threads is that it’s easy to do what I just described with them.

Of course, if this is hard to do I could just take my medicine and write the business logic as a finite state machine (tedious to write, but I am confident it’s going to work); this will depend on how easy the solutions I learn here are. (My business logic is due for a rewrite because of some breaking changes I made, so I made this post to decide what I will do.)

This is something I am trying to get opinions on actually, whether people think that will be easier.

Zig does not support stackless coroutines (yet). WASM also does not have a standard way to do stackful coroutines. It’s possible to do coroutines in the browser through JSPI or Emscripten Asyncify (as Emscripten’s fiber.h seems to do). That lets you suspend and resume a stack frame.

If you want to cheat, you could also write the scheduler in Rust, as it has stackless coroutines, and have it call your Zig functions :slight_smile:

I need stackful coroutines, since the challenge is that in the “do stuff” steps I store a lot of information.

The way I see it, I can either try to figure out a trick to make threads work in WASM without the sokol thread blocking, or I can look into the Asyncify stuff, or I can write my logic as a finite state machine.

I don’t have the same requirements for quality on the web version. It will be like a demo you can run in the browser, so if it’s not perfect it’s not the end of the world; the actual product will only run on the user’s machine.

Stackless really does not mean you don’t have a stack. It just means the compiler generates the state machine for you that you would otherwise write manually (using the caller’s stack).

(emscripten asyncify does this too)

Okay, I didn’t know that. If that’s the case, why is there a distinction between stackless and stackful coroutines if stackless ones essentially have a stack?

So would it be possible to use coroutines (preferably written in C, I didn’t like Rust) to make this API work without having to go through the Emscripten Asyncify stuff?

I’m not aware of any C compiler with stackless coroutines, so no. You would still have to use the Emscripten fiber.h API (you can use this from Zig too), or the Asyncify transformation, which is lower level.

Stackless is a bit of a silly name, but it just means there is no dedicated stack for each coroutine; the state lives elsewhere and the caller’s stack is shared. Stackless is nice because it is purely a compiler transformation, so it will work anywhere. This is why Rust on embedded is so good.


Thank you, I will keep reading about that stuff.