Help with getting a simple call to js through emscripten working

Hi :slight_smile:

I’ve been playing around a bit with webassembly - starting with a sokol-zig-imgui-sample which provides a template with sokol and imgui (which uses emscripten).

I’ve managed to get it working using --js-library, which involves passing in a JavaScript file that defines the extern function.

What I’d like to do instead is to do away with the js file and use em_js in emscripten.

Why?

  • Curiosity - I’d like to understand if and how I can get it working
  • I’d rather have a c file in the project than a js file

However, I’ve run into a linker error that I don’t seem to be able to resolve.

error: undefined symbol: jsLog (referenced by root reference (e.g. compiled C/C++ code))
warning: To disable errors for undefined symbols use `-sERROR_ON_UNDEFINED_SYMBOLS=0`
warning: _jsLog may need to be added to EXPORTED_FUNCTIONS if it arrives from a system library
Error: Aborting compilation due to previous errors
emcc: error: '/shri-tmp/zig/p/N-V-__8AAEWFDwBdd7oE1EcCE3lK2y01-2ourXGKhpZeaZxQ/node/22.16.0_64bit/bin/node /shri-tmp/zig/p/N-V-__8AAEWFDwBdd7oE1EcCE3lK2y01-2ourXGKhpZeaZxQ/upstream/emscripten/tools/compiler.mjs -' failed (returned 1)
error: stderr:
error: undefined symbol: jsLog (referenced by root reference (e.g. compiled C/C++ code))
warning: To disable errors for undefined symbols use `-sERROR_ON_UNDEFINED_SYMBOLS=0`
warning: _jsLog may need to be added to EXPORTED_FUNCTIONS if it arrives from a system library
Error: Aborting compilation due to previous errors
emcc: error: '/shri-tmp/zig/p/N-V-__8AAEWFDwBdd7oE1EcCE3lK2y01-2ourXGKhpZeaZxQ/node/22.16.0_64bit/bin/node /shri-tmp/zig/p/N-V-__8AAEWFDwBdd7oE1EcCE3lK2y01-2ourXGKhpZeaZxQ/upstream/emscripten/tools/compiler.mjs -' failed (returned 1)

error: the following command exited with error code 1:
/home/shri/.cache/zig/p/N-V-__8AAEWFDwBdd7oE1EcCE3lK2y01-2ourXGKhpZeaZxQ/upstream/emscripten/emcc -Og -sSAFE_HEAP=1 -sSTACK_OVERFLOW_CHECK=1 -sUSE_WEBGL2=1 -sNO_FILESYSTEM=1 "-sMALLOC='emmalloc'" --shell-file=/home/shri/.cache/zig/p/sokol-0.1.0-pb1HK1HTLQAsjVm5gHGpGI85Rwhwvu2KgCOTH-ZJ9sCS/src/sokol/web/shell.html ./.zig-cache/o/da30d3b095117c29efb96f800e23b36d/libshine.a ./.zig-cache/o/da30d3b095117c29efb96f800e23b36d/libshine.a ./.zig-cache/o/a3332d7b349381aee3e9ec9a0f037be3/libsokol_clib.a ./.zig-cache/o/7a172070ee71a28bcfb28c07fbe28cb7/libcimgui_clib.a -o /shri-tmp/triangle/games/shine/.zig-cache/o/a1cacaef35724976a8454515bb2f39d8/shine.html

Some research led me to this google groups discussion which suggests an issue with how the function is exported (I struggled with really understanding it). I am, however, not clear on whether it really applies to my situation or how to apply the suggestions.

I have forked the template project and added the smallest change needed to reproduce the problem to the forked repo.

Can someone here point me in the right direction?

Thank you

I’m not sure yet why emcc cannot resolve the JS function directly, but here’s a quick workaround. Change the libjs.c file like this:

#include <emscripten.h>

EM_JS(void, _emsc_jsLog, (const char* s), {
  console.log(UTF8ToString(s));
});

void jsLog(const char* s) {
  _emsc_jsLog(s);
}

…basically add a wrapper function in the same compilation unit which calls the EM_JS function.

Then zig build run -Dtarget=wasm32-emscripten and in the browser that opens look into the JS console and there’s your ‘hello from zig’.

When looking at the libJs.o file in the Zig cache with emnm in the original I see:

00000000 d .debug_abbrev
00000000 d .debug_line
00000000 d .debug_str
00000004 D __em_js__jsLog
00000000 D __em_js_ref_jsLog
         U jsLog

I.e. the jsLog symbol is listed as ‘undefined’ (which figures, because it will live in the .js file which the Emscripten linker step creates).

With the C wrapper function it looks like this:

00000000 d .debug_abbrev
00000000 d .debug_line
00000000 d .debug_str
00000004 D __em_js___emsc_jsLog
00000000 D __em_js_ref__emsc_jsLog
         U __indirect_function_table
         U __stack_pointer
         U _emsc_jsLog
00000001 T jsLog

I don’t know yet why the Emscripten linker finds the JS function when called from the same compilation unit, but not when directly called from the Zig code… need to investigate…

2 Likes

Ok, I think I kinda know what happens, and it’s not Zig specific but also happens in a pure Emscripten project…

First: it works if there’s at least one called C function in the same compilation unit as the EM_JS() function, it doesn’t matter what this C function does, it can be an empty dummy function, but it must be called from somewhere else.

Here’s what I think happens:

The source file which contains the EM_JS() function is compiled into a .o file, which is then assembled into a static link library along with the Zig code. Putting all this stuff into a library is needed because the actual linking is delegated to emcc, i.e. it’s not the typical situation of letting the Zig compiler build an executable.

But notably this static link library does not contain any implementation of the EM_JS() function, it only contains a global C string with the JS function body under a different symbol name (most likely __em_js__jsLog). Most importantly the static link library does not contain any symbol jsLog except for an UNDEFINED entry.

When the linker (in this case emcc) links a static library, it will ignore all .o items in the library which don’t have any symbols which the linker is looking for. Here the linker is looking for an implementation of jsLog, but this is not in the library, so it doesn’t pull in the libjs.o item which contains the global string with the Javascript function body… which then means the Emscripten-magic for EM_JS doesn’t kick in and the EM_JS Javascript function isn’t included in the output .js file which is created next to the .wasm file…

At least that’s my current theory… e.g. sort of a chicken-egg situation…

As a more generic workaround, change your libJs.c file like this (I’ll explain the EM_JS_DEPS() later):

#include <emscripten.h>

EM_JS_DEPS(bla, "$UTF8ToString");

EM_JS(void, jsLog, (const char* s), {
  console.log(UTF8ToString(s));
});

void dummy(void) {}

And somewhere in your Zig code:

extern fn dummy() void;

…and you need to call that dummy function somewhere in the active code, for instance at the start of main():

pub fn main() void {
    dummy();
    ...
}

…this causes the libjs.o file to be pulled in from the ‘main library’, which means that also the magic EM_JS() global C string is pulled in and the Emscripten linker can do its magic to turn this string into a Javascript function.

Maybe there’s a more straightforward way, but I think this is an acceptable workaround. In a real-world application you would probably have a web_utils.c source file which contains a mix of web-specific EM_JS and regular C functions which would be called from the Zig code, so you wouldn’t need such an empty dummy function for the object file to be pulled in.
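A rough sketch of what such a mixed web_utils.c could look like. The names web_utils_js_log and web_utils_log are made up for illustration, and the native fallback branch is an assumption added only so the file also builds outside Emscripten; the EM_JS and EM_JS_DEPS parts follow the snippet above.

```c
#include <assert.h>
#include <stdio.h>

#ifdef __EMSCRIPTEN__
#include <emscripten.h>

/* Ask the Emscripten linker to include the UTF8ToString() JS helper. */
EM_JS_DEPS(web_utils, "$UTF8ToString");

/* JavaScript side: forward the C string to the browser console. */
EM_JS(void, web_utils_js_log, (const char* s), {
  console.log(UTF8ToString(s));
});
#else
/* Hypothetical native fallback so the same file runs outside the browser. */
static void web_utils_js_log(const char* s) {
  puts(s);
}
#endif

/* Regular C function. Calling it from Zig gives the linker a reason to pull
   this object file out of the archive, EM_JS string included.
   Returns the length of the formatted line, or -1 if it did not fit. */
int web_utils_log(const char* tag, const char* msg) {
  char line[256];
  assert(tag != NULL && msg != NULL);
  int n = snprintf(line, sizeof line, "[%s] %s", tag, msg);
  if (n < 0 || (size_t)n >= sizeof line) return -1;
  web_utils_js_log(line);
  return n;
}
```

On the Zig side this would presumably be declared as extern fn web_utils_log(tag: [*:0]const u8, msg: [*:0]const u8) c_int; and called like any other C function, with no dummy() needed.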

2 Likes

PS: the EM_JS_DEPS(bla, "$UTF8ToString"); is technically needed so that the Emscripten linker includes the UTF8ToString() helper function from the Emscripten “JS standard library”. It only works without it because the sokol C library also pulls this function into the link step.

PPS: regarding this:

Figuring out a ‘native’ em_js() Zig implementation would be a nice side project, so that JS code could be directly embedded into Zig source files instead of having to use C files just to declare embedded Javascript functions…

The magic is basically that the EM_JS C macro puts the Javascript source code into a specially formatted global C string with __attribute__((section("em_js"), aligned(1))).

I guess the Zig equivalent of this is linksection("em_js").

This macro would be a nice challenge for Zig to prove that it can really fully replace the C preprocessor :wink:

This global C string is then tunnelled all the way through to the Emscripten linker, which will search for those special variables and extract the Javascript function bodies for insertion into the .js file that’s generated alongside the .wasm. Zig doesn’t need to care about this: all it needs is to generate those special linksection("em_js") C string variables, the rest is taken care of by the emcc link step.
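To make the string format a bit more concrete, here is a host-side sketch in plain C. PACKED_JS, packed_params_len and packed_body are hypothetical names and not part of Emscripten; the macro only mimics the string-building half of EM_JS (parameter list, a "<::>" marker, then the function body) and does neither the section placement nor the function import.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for EM_JS: only builds the packed string
   "<params><::><body>". It does not declare the imported function and does
   not place the string in the "em_js" section. */
#define PACKED_JS(name, params, ...) \
  static const char packed_##name[] = #params "<::>" #__VA_ARGS__

PACKED_JS(jsLog, (const char* s), { console.log(UTF8ToString(s)); });

/* Length of the parameter list, i.e. everything before the "<::>" marker. */
static size_t packed_params_len(const char* packed) {
  const char* sep = strstr(packed, "<::>");
  assert(sep != NULL);
  return (size_t)(sep - packed);
}

/* Pointer to the JavaScript function body that follows the marker. */
static const char* packed_body(const char* packed) {
  const char* sep = strstr(packed, "<::>");
  assert(sep != NULL);
  return sep + strlen("<::>");
}
```

Splitting at the marker is roughly what any consumer of the string has to do to recover the parameter names and the body; how emcc does it internally is not shown here.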

2 Likes

Hi :slight_smile:

Firstly, thank you for putting together and sharing the sokol zig bindings as well as the template code for sokol+imgui (which I am using as mentioned before). These gave me a real headstart in playing with these tools :smiley:

Secondly, thank you for this suggested fix. It has worked a charm. The thread that I linked (which you were involved in as well) covered something about symbols being exported/undefined etc. I didn’t quite understand what was being explained.

From what you are saying, am I right in understanding that the library it pulls together keeps the individual compilation units separately in there and because nothing in libjs.c was being used elsewhere, emscripten linker was not able to see that object? I hadn’t realised that the library in this case would keep the units separate - and thought that since jsLog, as an extern was used in the zig file, that it would get included.

With your explanation (assuming I understood it correctly), it makes a lot more sense.

I am putting together a blog post of my experience getting these bits working and will share here once complete.

I will also try and see if I can get it working directly in zig - though it might be a bit too low level for me.

1 Like

Yes, static link libraries are basically just ‘archive files’ for separate object files (thus the .a file extension).

The linker will only include objects from a static library that contain symbols which are referenced elsewhere, it’s basically a very primitive form of dead code elimination.

This is why some libraries (like MUSL) have each function in its own source file (e.g. see: musl/src/stdio at kraj/master · kraj/musl · GitHub). Each function is then compiled into its own object file and then everything is ‘archived’ into a static library. That way a program linking with MUSL only gets the C stdlib functions linked that are actually used by the program.
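The selection rule described above can be modelled in a few lines of plain C. This is only a toy model under simplifying assumptions (one root symbol, made-up member names main.o and libjs.o, no real object file parsing), not how wasm-ld or emcc is actually implemented, but it reproduces the jsLog situation from this thread:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_SYMS 4
#define MAX_WANTED 32

/* One object file stored in a static library (.a archive). */
typedef struct {
  const char* name;
  const char* defines[MAX_SYMS]; /* symbols it defines, NULL-terminated */
  const char* needs[MAX_SYMS];   /* undefined symbols it references */
  bool pulled;
} Member;

static bool defines_symbol(const Member* m, const char* sym) {
  for (size_t i = 0; i < MAX_SYMS && m->defines[i] != NULL; i++) {
    if (strcmp(m->defines[i], sym) == 0) return true;
  }
  return false;
}

/* Starting from one root symbol, pull each member that defines a wanted symbol
   and add that member's own undefined symbols to the wanted list. */
static void select_members(Member* members, size_t count, const char* root) {
  const char* wanted[MAX_WANTED] = { root };
  size_t n_wanted = 1;
  for (size_t w = 0; w < n_wanted; w++) {
    for (size_t i = 0; i < count; i++) {
      Member* m = &members[i];
      if (m->pulled || !defines_symbol(m, wanted[w])) continue;
      m->pulled = true;
      for (size_t k = 0; k < MAX_SYMS && m->needs[k] != NULL; k++) {
        assert(n_wanted < MAX_WANTED);
        wanted[n_wanted++] = m->needs[k];
      }
    }
  }
}

/* Models the thread's setup: does the object holding the EM_JS string get
   pulled in, with or without a called dummy() function next to it? */
static bool libjs_is_pulled(bool with_dummy) {
  Member members[] = {
    { "main.o", { "main" }, { "jsLog", with_dummy ? "dummy" : NULL }, false },
    { "libjs.o", { "__em_js__jsLog", with_dummy ? "dummy" : NULL }, { NULL }, false },
  };
  select_members(members, 2, "main");
  return members[1].pulled;
}
```

In this model libjs_is_pulled(false) stays false because no member defines jsLog, so nothing ever asks for libjs.o; with the dummy() definition and call in place, the same object is selected and its __em_js__jsLog string comes along.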

Some compiler toolchains can also split large object files into one static library ‘item’ per function so you don’t need to put each function into its own source file but still get per-function dead code elimination during linking, can’t remember the name of that feature in MSVC right now…

And with LTO/LTCG enabled it’s probably different again.

PS: you’re also not the only one stumbling over that problem recently :wink: E.g.: Discord

I will also try and see if I can get it working directly in zig - though it might be a bit too low level for me.

The main problem there might be that Zig multiline strings look really ugly, e.g. even if all other problems are fixed, the JS function body would need to look like this:

em_js(...., 
\\    Module.sokol_beforeunload = (event) => {
\\        if (__sapp_html5_get_ask_leave_site() != 0) {
\\            event.preventDefault();
\\            event.returnValue = ' ';
\\        }
\\    }
);

…which kinda defeats the whole purpose of embedding the JS directly into Zig source files.

1 Like

Thinking about this a bit more… tbh I think for Zig projects putting the JS code into a separate .js file is the better solution…

In C EM_JS is mainly useful for single-file libraries like the sokol headers. Projects can simply copy the C header into their source directory and there’s no separate .js file dangling off which also needs to be copied and integrated into the build.

But in Zig there’s the package manager and build system for dealing with ‘mixed-language-dependencies’, and as a bonus you get proper syntax highlighting and code completion in the .js files, which will never work correctly with mixed-language source files.

It might make sense to make the linking with .js files a bit more ergonomic in the sokol-zig package though (or ideally I eventually want to split the emsdk stuff into its own package).

PS: OTOH… dealing with such .js file dependencies in sub-dependencies would be quite ugly since all those .js files need to be collected in the top-level project and passed into the emcc link step… with EM_JS() in C source files this problem wouldn’t exist :thinking:

Would be nice if the Zig build system modules would not only have a specialized addCSourceFile() but also a general addForeignLanguageSourceFile() and a feature to gather those extra source files from the dependency tree of a root module… would also come in handy for GPU shader source files.

1 Like

I didn’t read through the whole thing but wouldn’t @embedFile() be a reasonable alternative to attaching .js as inline strings?

1 Like

…with one file per JS function… yeah interesting idea… and maybe with meta-information in a JS doc-comment header which could be parsed by a Zig comptime function to ‘reformat’ the JS file content into something that’s compatible with the magic EM_JS string which is generated by the C macro… :thinking:

PS: …with that idea of a ‘comptime parser’ one could probably also put many JS functions into the same .js file and split that up into individual per-function ‘EM_JS’ magic strings…

Sounds exactly like how things are usually done in the JavaScript ecosystem… :zany_face:

1 Like

I have always wondered what .a stood for - archive makes sense.

By and large, I agree. In my case though, my intention is for the function to be only glue that defers to a JavaScript function defined on the web side, e.g.:

EM_JS(void, jsLog, (const char* s), {
	Module.jsLog(UTF8ToString(s));
});

This feels like an interesting way to solve it. What might be nicer though is if it automatically wrote some glue to call a javascript function of the same name / parameters that’s in the final web context.

I am setting that up manually now with the glue.

i.e. It would be nice if I could just define the extern function:

extern fn jsLog(ptr: [*]const u8) void;

and it would call the corresponding javascript function on the web side (perhaps on the Module global)

function jsLog(msg) {
  console.log(msg)
}

It would mean that both sides could “just work” - and on the web side, you could write the code in typescript, have any of the frameworks, whatever.

I played around with it a bit, but was quickly out of my depth.

I wrote up a post on the two ways I was able to get it working (using the js file and EM_JS with help from @floooh) and the little progress I was able to make with trying to get it working natively with just zig.

I welcome any suggestions / corrections that anyone might have.

1 Like

I’m confused, isn’t this kinda what --js-library (or --pre-js) does, which you said you don’t want to use in your OP?

extern fn jsLog(msg: [*:0]const u8) void;

addToLibrary({
    jsLog(msg) {
        console.log(UTF8ToString(msg))
    },
    jsLog__deps: ["$UTF8ToString"],
})

If you meant that you would prefer to write real JavaScript modules with export/import and whatnot then that’s fair but mostly a limitation of Emscripten being an old dinosaur that was developed before modules.


EM_JS is not possible to implement in Zig at the moment because Zig doesn’t have the ability to export Wasm globals in a way that makes them visible to Emscripten’s tooling. But EM_ASM is possible:

pub fn EM_ASM(comptime code: []const u8, args: anytype) void {
    _ = @call(.auto, emscripten_asm_const_int, .{ CODE_EXPR(code), EM_ASM_ARG_SIGS(args) } ++ args);
}

pub fn EM_ASM_INT(comptime code: []const u8, args: anytype) c_int {
    return @call(.auto, emscripten_asm_const_int, .{ CODE_EXPR(code), EM_ASM_ARG_SIGS(args) } ++ args);
}

pub fn EM_ASM_PTR(comptime code: []const u8, args: anytype) ?*anyopaque {
    return @call(.auto, emscripten_asm_const_ptr, .{ CODE_EXPR(code), EM_ASM_ARG_SIGS(args) } ++ args);
}

pub fn EM_ASM_DOUBLE(comptime code: []const u8, args: anytype) f64 {
    return @call(.auto, emscripten_asm_const_double, .{ CODE_EXPR(code), EM_ASM_ARG_SIGS(args) } ++ args);
}

extern fn emscripten_asm_const_int(code: [*:0]const u8, arg_sigs: [*:0]const u8, ...) c_int;
extern fn emscripten_asm_const_ptr(code: [*:0]const u8, arg_sigs: [*:0]const u8, ...) ?*anyopaque;
extern fn emscripten_asm_const_double(code: [*:0]const u8, arg_sigs: [*:0]const u8, ...) f64;

fn CODE_EXPR(comptime code: []const u8) [*:0]const u8 {
    return withSection("em_asm", code ++ "");
}

fn EM_ASM_ARG_SIGS(args: anytype) [*:0]const u8 {
    comptime var sigs: [args.len]u8 = undefined;
    inline for (&sigs, args) |*sig, arg| {
        const Arg = @TypeOf(arg);
        const bits = @bitSizeOf(Arg);
        sig.* = switch (@typeInfo(@TypeOf(arg))) {
            .bool, .int, .error_set, .@"enum" => if (bits <= 32) 'i' else if (bits <= 64) 'j' else 'p',
            .float => if (bits <= 32) 'f' else if (bits <= 64) 'd' else 'p',
            .@"struct" => |info| if (info.backing_integer != null) (if (bits <= 32) 'i' else if (bits <= 64) 'j' else 'p') else 'p',
            else => 'p',
        };
    }
    return &sigs ++ "";
}

pub fn withSection(comptime section: []const u8, comptime value: anytype) @TypeOf(&declareWithSection(section, value).x) {
    return &declareWithSection(section, value).x;
}

fn declareWithSection(comptime section: []const u8, comptime value: anytype) type {
    const info = @typeInfo(@TypeOf(value)).pointer;
    return struct {
        const x linksection(section) = switch (info.size) {
            .one => value,
            .slice => if (info.sentinel()) |s| value[0..value.len :s] else value[0..value.len],
            else => unreachable,
        }.*;
    };
}

EM_ASM((
    \\const ptr = $0
    \\const width = $1
    \\const height = $2
    \\
    \\const imageData = new ImageData(new Uint8ClampedArray(HEAPU8.buffer, ptr, width * height * 4), width, height)
    \\
    \\const ctx = Module["canvas"].getContext("2d")
    \\ctx.canvas.width = width
    \\ctx.canvas.height = height
    \\ctx.putImageData(imageData, 0, 0)
), .{ rgba.ptr, params.width, params.height });

1 Like

Is this true though? The EM_JS macro just creates a global string variable in the em_js linker section, and I think the same should be possible with Zig’s linksection keyword.

In C code compiled with Zig, the EM_JS macro also works when compiled with wasm32-emscripten (you only need to use emcc for linking), so the under-the-hood machinery is definitely there.

PS: hmm ok, there’s a bit more than just the string, there’s also a WASM-specific definition for the function-import-table, this is probably the part that’s currently not possible in Zig:

void hello (void) __attribute__((import_module("env"), import_name("hello"))); 

__attribute__((visibility("hidden"))) void* __em_js_ref_hello = (void*)&hello; 

__attribute__((used)) __attribute__((section("em_js"), aligned(1))) char __em_js__hello[] = "(void)" "<::>" "{ console.log(\"Hello World!\"); }"; ;

I was experimenting with EM_JS a while ago but I revisited it now and I think I figured it out! From @done-ah’s post:

# From C
- 3: D <__em_js__jsLog> segment=1 offset=0 size=53 [ exported no_strip binding=global vis=hidden ]
# From zig
- 3: D <__em_js__jsLog> segment=1 offset=0 size=53 [ binding=global vis=default ]

Note exported. In C, you can set this flag by decorating a symbol with __attribute__((used)) or __attribute__((export_name("foo"))), and it instructs the linker to include the symbol as a Wasm export.

Zig doesn’t currently have anything equivalent to the exported symbol flag, so you need to pass --export=foo or --export-dynamic (-rdynamic for zig build-*) to the linker in order for the symbol to be included as a Wasm export.

zig build-obj .\z.zig -target wasm32-emscripten -lc
emcc .\z.o '-Wl,--export-dynamic' -o .\www\index.html

With this workaround, you can implement EM_JS like this (using withSection() from my previous post):

const std = @import("std");

pub fn EM_JS(
    comptime ret: type,
    comptime name: []const u8,
    comptime params: anytype,
    comptime code: []const u8,
) *const EM_JS_FN(ret, params) {
    comptime var packed_code: [:0]const u8 = "(";
    const fields = @typeInfo(@TypeOf(params)).@"struct".fields;
    if (fields.len == 0) {
        packed_code = packed_code ++ "void";
    } else {
        for (@typeInfo(@TypeOf(params)).@"struct".fields) |field| {
            // The declared type for the parameter doesn't actually matter;
            // emcc ignores everything except for the parameter name.
            packed_code = packed_code ++ "int " ++ field.name ++ ", ";
        }
        packed_code.len -= ", ".len;
    }
    packed_code = packed_code ++ ")<::>{" ++ code ++ "}";
    @export(withSection("em_js", packed_code), .{ .name = "__em_js__" ++ name });
    return @extern(*const EM_JS_FN(ret, params), .{ .name = name });
}

pub fn EM_ASYNC_JS(
    comptime ret: type,
    comptime name: []const u8,
    comptime params: anytype,
    comptime code: []const u8,
) *const EM_JS_FN(ret, params) {
    return EM_JS(ret, "__asyncjs__" ++ name, params, " return Asyncify.handleAsync(async () => {" ++ code ++ "}); ");
}

fn EM_JS_FN(ret: type, params: anytype) type {
    const fields = @typeInfo(@TypeOf(params)).@"struct".fields;
    var fn_params: [fields.len]std.builtin.Type.Fn.Param = undefined;
    for (&fn_params, fields) |*fn_param, field| {
        fn_param.* = .{
            .is_generic = false,
            .is_noalias = false,
            .type = @field(params, field.name),
        };
    }
    return @Type(.{ .@"fn" = .{
        .is_generic = false,
        .params = &fn_params,
        .is_var_args = false,
        .calling_convention = .c,
        .return_type = ret,
    } });
}

pub fn EM_JS_DEPS(tag: []const u8, deps: []const u8) void {
    @export(withSection("em_lib_deps", deps ++ ""), .{ .name = "__em_lib_deps_" ++ tag });
}

const jsLog = EM_JS(void, "jsLog", .{ .msg = [*:0]const u8 },
    \\console.log(UTF8ToString(msg))
);

comptime {
    EM_JS_DEPS("jsLog", "$UTF8ToString");
}

It’s a bit awkward. Zig doesn’t let you obtain function parameter names via @typeInfo() so it’s not possible to do something like EM_JS("foo", fn (a: i32, b: i32) i32, "code"). Specifying the return type and parameters (as an anonymous struct of types) separately is the best compromise.

3 Likes