Dynamic Memory Allocations in Wasm

Hey there

I wanted to code a WASM module in Zig for a side project, which was supposed to do the calculations for the JS code. After I managed to get a basic WASM file running, I realized that I could not allocate any memory on the heap … because there is no heap (no libc).

After that, I googled for solutions and alternatives and arrived at my current understanding. By default, WASM does not offer any non-stack allocations, and thus I have to know ahead of time how much memory I need (which I don’t …). However, C/C++ (Emscripten), Java, and probably others manage to get around that by providing a runtime that makes “normal” dynamic malloc allocations possible.

Zig doesn’t have such a runtime (yet?), so I either have to live with the stack or, as described in previous questions here, compile my Zig code to .a/.o files (which one?) and then link them with Emscripten, which produces the .wasm file and the JS glue code.

If there are any misunderstandings in there, please tell me.

My questions would be

  1. Which cpu_arch, os_tag combination is correct?
    It cannot be “freestanding”, because then I obviously cannot use allocators, and I tried to use emscripten as an OS tag, but that is not valid.

  2. Do I compile to .a or .o files, or are both possible? What are the advantages of one over the other?
    I would be super grateful for a minimal working example :melting_face:

  3. Is there a chance that the situation improves in the future, either through a solidified Emscripten pipeline or a dedicated runtime, maybe just for memory allocations?

Thanks
and I am grateful for all input.

Last I worked with wasm I was able to use the std.heap.wasm_allocator just fine.


For a complete Emscripten example you can look at the sokol-zig samples:

…e.g. try zig build -Dtarget=wasm32-emscripten run-cube.

(this will take a while since it installs the Emscripten SDK into the global Zig cache)

This compiles both C and Zig code with the Zig compiler and uses the Emscripten SDK to provide the C library and headers, and for linking via emcc.

For memory allocation on the Zig side, the std.heap.c_allocator should work since Emscripten will link with the Emscripten-provided MUSL anyway (the sokol-samples don’t do any allocation on the Zig side though - but the underlying C library does).

…alternatively also check out the pacman.zig build.zig here which shows how the whole Emscripten stuff works in an upstream project:

…if you don’t need to run in the browser, the wasm32-wasi target is most likely the better option since it works without the Emscripten SDK (and the std.heap.c_allocator should work there too).

PS: for a really minimal starting point without Emscripten but WASI:

hello.zig:

const print = @import("std").debug.print;

pub fn main() void {
    print("Hello World!\n", .{});
}

zig build-exe -target wasm32-wasi hello.zig
wasmer hello.wasm
Hello World!

From there on it’s more or less just “draw the rest of the f*cking owl” :wink:

…for code that’s linked with Emscripten, use wasm32-emscripten instead, and for other cases (e.g. if you do all the web-integration on your own) use wasm32-freestanding (and I guess this is also where the wasm_allocator comes in).


If you insist on doing things manually, you can increase the amount of memory available to WebAssembly using @wasmMemoryGrow. That’s how std.heap.wasm_allocator works. The change will cause the ArrayBuffer on the JavaScript side to become detached, so all views referencing the buffer must be recreated.
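To make the detachment concrete, here is a minimal JavaScript sketch. It uses a standalone WebAssembly.Memory as a stand-in for the memory a real module would export (the stand-in and all variable names are my assumptions, not part of the post above). Growing from the JS side via memory.grow() affects existing views the same way as a grow issued from inside the module.

```javascript
// Stand-in for `instance.exports.memory` of a real module (assumption).
const memory = new WebAssembly.Memory({ initial: 1 }); // 1 page = 64 KiB

let view = new Uint8Array(memory.buffer);
view[0] = 42;
const staleView = view; // keep a reference to the old view

// Growing by one page detaches the old ArrayBuffer.
// grow() returns the previous size in pages.
const previousPages = memory.grow(1);

// A view over a detached buffer reports length 0.
const staleLength = staleView.length;

// Recreate the view from the new buffer; existing contents are preserved.
view = new Uint8Array(memory.buffer);
const firstByte = view[0];
const newLength = view.length;

console.log({ previousPages, staleLength, firstByte, newLength });
```

In practice this means you should never cache a typed-array view across a call into the module that might allocate; re-create it from memory.buffer each time.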


Thank you all, I’ll try to work with all of that :+1:

If you can, I would suggest avoiding Emscripten and instead favor freestanding which is much simpler and more flexible; you can compile your Zig code immediately to a .wasm binary that you can then load from JavaScript whichever way you like.
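As a rough sketch of the JavaScript side, the snippet below instantiates a module and touches its exported memory. To stay self-contained it uses a tiny hand-assembled module as a stand-in for a compiled .wasm file; that module, the export name "memory", and the variable names are assumptions for illustration. In a real project you would load the bytes of your own .wasm file instead (in the browser typically via WebAssembly.instantiateStreaming).

```javascript
// Hand-assembled stand-in for a compiled module (assumption): it only
// defines one page of memory and exports it under the name "memory".
const moduleBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // "\0asm", version 1
  0x05, 0x03, 0x01, 0x00, 0x01,                   // memory section: 1 memory, min 1 page
  0x07, 0x0a, 0x01, 0x06,                         // export section: 1 export, name length 6
  0x6d, 0x65, 0x6d, 0x6f, 0x72, 0x79,             // "memory"
  0x02, 0x00,                                     // export kind: memory, index 0
]);

// Synchronous compilation is fine for a module this small.
const wasmModule = new WebAssembly.Module(moduleBytes);
const instance = new WebAssembly.Instance(wasmModule, {});

// The exported memory is a WebAssembly.Memory; wrap its buffer in a view
// to read or write the module's linear memory from JavaScript.
const exportedMemory = instance.exports.memory;
const bytes = new Uint8Array(exportedMemory.buffer);
bytes[0] = 7;

console.log(exportedMemory.buffer.byteLength, bytes[0]);
```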

Emscripten is an old dinosaur full of ugly annoying warts and mainly makes sense if you want to write interactive multimedia applications like games without writing a non-trivial amount of Wasm-to-browser-API glue code yourself. If you’re doing raw number crunching, Emscripten doesn’t really offer anything of value.

When targeting Wasm and doing dynamic memory allocation there are two things that are important to consider:

  • Who manages the heap?
  • How are global variables, the stack and the heap laid out in linear Wasm memory?

If you’re targeting wasm32-freestanding, it’s up to you to manage the heap, which is most easily done by using std.heap.wasm_allocator.

If you’re targeting wasm32-emscripten or wasm32-wasi and linking with libc (which you should; it doesn’t really make sense not to link libc when compiling for those targets), it’s the libc that manages the heap. You should use either std.heap.c_allocator or std.heap.PageAllocator, which will allocate everything via the libc. You should not use std.heap.wasm_allocator, because then you run the risk of overwriting the libc-managed heap (more details below).

Unfortunately, std.heap.page_allocator (which is also the default backing allocator used by std.DebugAllocator) is an alias for std.heap.wasm_allocator when targeting any wasm32 target, even when linking libc, which is a mistake and a frequent source of pain and confusion. The Emscripten toolchain might also define the Wasm memory as non-growable, which will cause std.heap.wasm_allocator to fail. Be wary of this if you decide to target Emscripten or WASI.
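The non-growable case can be reproduced from JavaScript with a memory whose maximum equals its initial size. This is only a host-side sketch with made-up variable names; inside a module the underlying memory.grow instruction reports failure with -1 rather than throwing, which a grow-based allocator would then surface as an out-of-memory error.

```javascript
// A memory that is not allowed to grow: maximum == initial (1 page).
const fixedMemory = new WebAssembly.Memory({ initial: 1, maximum: 1 });

let growError = null;
try {
  fixedMemory.grow(1); // request one more page
} catch (err) {
  growError = err;     // the JS API signals failure with a RangeError
}

const growFailed = growError instanceof RangeError;
const fixedBytes = fixedMemory.buffer.byteLength; // still one page

console.log({ growFailed, fixedBytes });
```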

Regarding how global variables, the stack and the heap are laid out in memory, there are compiler/linker options like --stack [bytes], --global-base=[bytes] or --initial-memory=[bytes] that you can use to fine-tune the layout. By default, a Zig program targeting wasm32-freestanding with no global variables and no dynamic memory allocation will reserve a 1 MiB stack (16 pages * 64 KiB each) that is anchored at the end of that 1 MiB chunk of memory and grows downwards:

   +----------+
   | <- stack |
   +----------+
   |          |
   0       1048576
page 0     page 16

If your program has (mutable) global variables, more memory is reserved after the stack and is used for globals:

   +----------+------------+
   | <- stack | globals... |
   +----------+------------+
   |          |            |
   0       1048576      1114112
page 0     page 16      page 17

The first time your program allocates something using std.heap.wasm_allocator, it uses @wasmMemoryGrow() to request another Wasm page and starts allocating from that page, which ensures that it won’t clobber the stack or global variables:

   +----------+------------+----------------------------+
   | <- stack | globals... | std.heap.wasm_allocator -> |
   +----------+------------+----------------------------+
   |          |            |                            |
   0       1048576      1114112                      1179648
page 0     page 16      page 17                      page 18
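The byte offsets in these diagrams can be checked with a few lines of JavaScript. The sketch assumes the defaults described above (16 stack pages) plus a single page of globals, and it uses a standalone WebAssembly.Memory in place of a real module; the constant and variable names are mine. The useful detail is that grow() returns the previous page count, which is exactly where the newly added page begins.

```javascript
const PAGE_SIZE = 64 * 1024; // one Wasm page = 65536 bytes
const STACK_PAGES = 16;      // default stack size described above
const GLOBALS_PAGES = 1;     // assumption: globals fit in a single page

const stackEnd = STACK_PAGES * PAGE_SIZE;                     // 1048576
const globalsEnd = (STACK_PAGES + GLOBALS_PAGES) * PAGE_SIZE; // 1114112

// Stand-in memory sized like the stack + globals layout.
const layoutMemory = new WebAssembly.Memory({
  initial: STACK_PAGES + GLOBALS_PAGES,
});

// Growing by one page mimics the allocator's first allocation.
const previousPageCount = layoutMemory.grow(1);       // 17
const firstHeapByte = previousPageCount * PAGE_SIZE;  // 1114112
const heapPageEnd = layoutMemory.buffer.byteLength;   // 1179648

console.log({ stackEnd, globalsEnd, firstHeapByte, heapPageEnd });
```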

If you’re targeting Emscripten or WASI and linking with libc, the libc may manage the heap differently. I believe most Wasm linkers define a special __heap_base symbol that points to the end of the globals and marks the start of the heap, which often ends up someplace in the middle of the same page as the globals.

   +----------+----------------------+
   | <- stack | globals... , heap -> |
   +----------+----------------------+
   |          |            ^         |
   0       1048576         |      1114112
page 0     page 16         |      page 17
                           |
                      __heap_base

You might be able to see why std.heap.wasm_allocator and the libc heap might clobber each other if the latter grows into the next page (obviously it depends on exactly how the allocators are implemented). You might also run into similar memory-corruption problems if you are using multiple different allocators that each assume they fully own certain ranges of memory, or if you’re haphazardly writing to arbitrary Wasm memory from JavaScript.
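Here is a deliberately simplified toy model of that clobbering scenario. It is not how any real libc or std.heap.wasm_allocator is implemented: both allocators are plain bump allocators, the __heap_base value is hypothetical, and all names are made up. It only shows that two allocators which each assume ownership of "everything from my start address upwards" will eventually hand out overlapping ranges.

```javascript
const PAGE_BYTES = 65536;
const HEAP_BASE = 16 * PAGE_BYTES + 1000; // hypothetical __heap_base after some globals
const NEXT_PAGE_START = 17 * PAGE_BYTES;  // where a page-granular allocator would start

// Toy bump allocator: hands out consecutive [start, end) ranges.
function makeBumpAllocator(start) {
  let next = start;
  return {
    alloc(size) {
      const offset = next;
      next += size;
      return { start: offset, end: offset + size };
    },
  };
}

// Two half-open ranges overlap if each starts before the other ends.
function overlaps(a, b) {
  return a.start < b.end && b.start < a.end;
}

const libcLikeHeap = makeBumpAllocator(HEAP_BASE);       // grows up from __heap_base
const pageLikeHeap = makeBumpAllocator(NEXT_PAGE_START); // grows up from the next page

const pageBlock = pageLikeHeap.alloc(4096);
const smallBlock = libcLikeHeap.alloc(1024);     // stays below the next page
const bigBlock = libcLikeHeap.alloc(PAGE_BYTES); // crosses into the next page

const smallOverlaps = overlaps(smallBlock, pageBlock); // false
const bigOverlaps = overlaps(bigBlock, pageBlock);     // true

console.log({ smallOverlaps, bigOverlaps });
```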


Thank you soo much. I partly pieced that together myself after I asked the question and looked at the wasm_allocator code, but it’s good that I now know that my understanding is correct.

You can use either the wasm_allocator or DebugAllocator, which, last time I checked, uses the wasm_allocator as the page allocator when targeting wasm.