Hello everyone, I’m trying to play around with basic hot-reloading through dynamic libraries.
I’m struggling to debug crashes that happen whenever I call a dynamically loaded function that does any allocation at all. Even debug printing crashes for some reason. However, I’m able to call the function fine if, for example, it just returns a simple u8.
Can anyone provide any guidance on what I might be missing? For context, I’m using a modified version of the default build script for the exe and shared library. The only major modifications are removing the unneeded parts (the static lib for the exe, and the exe for the shared library) and swapping out addStaticLibrary for addSharedLibrary. Trying to step through the code in gdb doesn’t work either. I’m also using the latest available Zig version (0.12.0-dev.1856+94c63f31f).
When you export a function, it automatically becomes callconv(.C), and function pointers should be const in Zig, so the correct type to pass to lookup is:
*const fn () callconv(.C) u8
Changing this does not solve the issue though. I’m not sure what is going on tbh, but I also hadn’t used std.DynLib before this. Hopefully someone more knowledgeable can weigh in.
The Zig standard library test doesn’t seem to be doing anything special, but it is also just loading and executing a simple arithmetic function.
I was able to walk through the dynamic library code in gdb by stepping through assembly instructions one at a time. Calling std.os.write(2, "hello\n") from within _test did seem to invoke the correct system call, but the pointer it passed as the text buffer to write did not point to the string “hello\n” at runtime.
Below is another example of unusual behavior with string literals.
So I tried recreating the code in C and it worked perfectly. Trying the C library with the Zig program caused an ElfHashTableNotFound error. Searching for the cause of this error, I found a comment on ziglang/zig issue #5360 (“DynLib fails to open libGL.so”), which recommends linking libc to use its dlopen function instead of Zig’s.
Doing so made the C and Zig libraries both work fine (including stepping through them in GDB). I can see from the discussion that Zig’s dlopen expects some different things vs the libc implementation, but I’m not quite sure why the code blows up when both ends are Zig.
For the sake of completeness, here’s the code I tested with:
#include <stdio.h>
extern int foo() {
    printf("Hello from C\n");
    return 69;
}
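For anyone following along, here is a rough sketch of what the host side could look like in C when going through libc's dlopen/dlsym, which is the path that ended up working. This is not the exact program from the thread: the helper name call_int_symbol and the typedef foo_fn are made up for illustration, and it assumes the library above was built as a shared object (for example a hypothetical ./libfoo.so).

```c
#include <dlfcn.h>
#include <stdio.h>

/* Pointer type matching the exported function above: no arguments, returns int.
 * (foo_fn is a hypothetical name used only in this sketch.) */
typedef int (*foo_fn)(void);

/*
 * Open the shared library at lib_path with libc's dlopen, look up symbol,
 * call it, and store its return value in *result.
 * Returns 0 on success, -1 if the library or the symbol cannot be found.
 * (call_int_symbol is a hypothetical helper, not part of any library.)
 */
int call_int_symbol(const char *lib_path, const char *symbol, int *result) {
    void *handle = dlopen(lib_path, RTLD_NOW);
    if (handle == NULL) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return -1;
    }

    dlerror(); /* clear any stale error state before calling dlsym */
    void *sym = dlsym(handle, symbol);
    const char *err = dlerror();
    if (err != NULL || sym == NULL) {
        fprintf(stderr, "dlsym failed: %s\n", err ? err : "symbol is NULL");
        dlclose(handle);
        return -1;
    }

    /* POSIX allows converting the void * returned by dlsym to a function pointer. */
    foo_fn fn = (foo_fn)sym;
    *result = fn();

    dlclose(handle);
    return 0;
}
```

Calling call_int_symbol("./libfoo.so", "foo", &value) from a main function should print the library's message and leave 69 in value, assuming the library was built with something like cc -shared -fPIC. On older glibc versions the host also needs -ldl when linking.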
Anyway, I think things could be cleared up a lot with a practical example somewhere (docs, Zig NEWS, ???). I’m down to write it myself, but I still don’t understand the root of the problem yet. Any pointers from someone knowledgeable are appreciated.