I am writing some zig bindings (zenoh-zig) for a c library (zenoh-c).
I was planning on the following workflow:
1. Use the build system to depend on the pre-compiled C static libraries (done)
2. Port the examples over to Zig using @cImport() (in progress)
3. Use learnings from porting the examples to write an easier-to-use Zig API
Is this a good workflow?
An alternative workflow could be:
Use translate-c only once, and modify the generated Zig code. Will this be easier or harder to maintain as the upstream changes?
Questions:
How do I know what I can delete from the translate-c output?
Is there a guide somewhere for writing bindings for C code?
The library is originally written in Rust. There is a bunch of loan…move…stuff that I don't understand. Where can I learn what this means?
Here is a snippet of what the ported code looks like. Note: this code does not compile because I couldn't figure out how to compare two [*c] pointers. This is absolutely brutal.
const std = @import("std");
const zenoh = @import("zenoh");

test "wrapping raw bytes into a z_bytes_t" {
    var payload: zenoh.c.z_owned_bytes_t = undefined;
    const input_bytes: []const u8 = &[_]u8{ 1, 2, 3, 4 };
    var output_bytes: zenoh.c.z_owned_slice_t = undefined;
    _ = zenoh.c.z_bytes_copy_from_buf(&payload, input_bytes.ptr, 4);
    _ = zenoh.c.z_bytes_to_slice(zenoh.c.z_bytes_loan_mut(&payload), &output_bytes);
    try std.testing.expectEqualSlices(u8, input_bytes, zenoh.c.z_slice_data(zenoh.c.z_slice_loan(&output_bytes)));
}
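One way to make a comparison like the one in the snippet type-check is to turn the C pointer into a slice before comparing. The following is a minimal sketch that does not use zenoh at all: `sliceFromC` is a hypothetical helper, and in the real test the length would have to come from the library (zenoh-c appears to offer a `z_slice_len` next to `z_slice_data`, but treat that as an assumption and check the header).

```zig
const std = @import("std");

/// Hypothetical helper: turn a C pointer plus a length into a Zig slice.
/// Returns an empty slice for a null pointer or a zero length, so callers
/// never slice through null.
fn sliceFromC(ptr: [*c]const u8, len: usize) []const u8 {
    if (ptr == null or len == 0) return &[_]u8{};
    return ptr[0..len];
}

test "compare C pointer data against a Zig slice" {
    const input = [_]u8{ 1, 2, 3, 4 };
    // A pointer to an array coerces to a C pointer, as a C API would return it.
    const c_ptr: [*c]const u8 = &input;
    try std.testing.expectEqualSlices(u8, &input, sliceFromC(c_ptr, input.len));
}
```

With a slice on both sides, `std.testing.expectEqualSlices` has matching argument types, which a bare `[*c]const u8` cannot provide because it carries no length.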
I would try to automate the process by extracting the C API declaration either via clang -Xclang -ast-dump=json -c [c_src_path] or using Zig's comptime reflection to iterate over the @cImport'ed struct (disclaimer: I haven't tried the Zig approach yet, in recent times at least).
With that extracted AST information you can then make your Zig output more "Zig idiomatic" (at least the naming convention, but maybe also things like accepting slices instead of ptr/size pairs etc…).
Here's how I generate a simplified JSON from Clang's very verbose AST dump (I put some restrictions regarding the C API on myself to simplify the data extraction though. This is easy because I also control the C API; for instance nested unnamed structs are not allowed, and no unions in general):
…the gen_zig.py script then takes this "ir.json" as input and generates a Zig module:
…and the result then looks like this (for a somewhat simple C API):
or using Zig's comptime reflection to iterate over the @cImport'ed struct
Not sure if your idea was to use this reflection to generate Zig source code and check that into Git, or to fully build the binding API at comptime. If the latter, this approach might produce an API that's hard for programmers and ZLS to inspect. At least IME, ZLS tends to give up with many comptime constructs, and it'd suck if there's no IDE support for types and functions when using the bindings API. Your current approach doesn't have any of these downsides.
…the first option, e.g. it would be a separate Zig exe which @cImports the C header, uses comptime code to iterate over the reflection info in the imported struct (which would replace the clang-ast-dump step), "ziggifies" the C API and then writes a Zig module output file which would be committed to Git.
E.g. quite similar to the current approach, but replacing Clang and all the Python code with Zig.
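To give an idea of what iterating over the imported struct could look like, here is a small sketch using `std.meta.declarations`. The `fake_c` namespace is a hypothetical stand-in for the result of `@cImport`, and `countPrefixed` is an illustrative helper; a real generator would emit Zig source for each declaration instead of just counting them.

```zig
const std = @import("std");

// Hypothetical stand-in for the struct returned by @cImport.
const fake_c = struct {
    pub const z_owned_thing_t = extern struct { _0: [8]u8 };
    pub fn z_thing_len(_: *const z_owned_thing_t) usize {
        return 8;
    }
    pub const SOME_MACRO = 42;
};

/// Count the public declarations whose name starts with `prefix`.
/// Intended to be called in a comptime context.
fn countPrefixed(comptime Namespace: type, comptime prefix: []const u8) usize {
    var n: usize = 0;
    for (std.meta.declarations(Namespace)) |decl| {
        if (std.mem.startsWith(u8, decl.name, prefix)) n += 1;
    }
    return n;
}

test "enumerate declarations by prefix" {
    const n = comptime countPrefixed(fake_c, "z_");
    try std.testing.expectEqual(@as(usize, 2), n);
}
```

Filtering by a library prefix like this is one way to skip the libc and macro declarations that a C import drags in alongside the API itself.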
Currently I have generated the zig code using translate-c and I am hand editing it to remove the [*c] and replace with *T wherever I can.
What happens if I screw up? Will I get a compile error? What happens when my dependency changes, will I get a compile error then? How do I know that my bindings are accurate?
Nope. If you are lucky you get a segfault or some other runtime error. This is an example of how unsafe C is.
When interfacing with C you have to deal with it, because there is no way Zig can know what kind of pointer it should be. That's why [*c] exists.
Depends on how it changes.
If the signatures of types/functions change in a way that's incompatible, you should get compile errors.
If it decides to use pointers differently without the above, then a segfault if you're lucky.
You need to track the changes and make your own if necessary.
Again, if signatures are inaccurate it should be a compile error; beyond that it's up to you to RTFM.
No, all sorts of weird things can happen, from parameters arriving with wrong values on the Zig side, to memory corruption, to "impossible" crashes. That's why it is a good idea not to create bindings manually.
In my bindings I generate a Zig wrapper function which then calls the "raw" C function. This wrapper can do any type conversions or "safe casts" if needed. Mostly the Zig function just calls the C function directly, but this still gives an additional safety layer: if the args of the Zig wrapper function are not type-compatible with the C function args, there will be a compile error.
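A rough sketch of the wrapper idea, with made-up names: `raw_checksum` stands in for a translated C declaration (it is defined in Zig here only so the example is self-contained), and `checksum` is the Zig-facing wrapper.

```zig
const std = @import("std");

// Stand-in for a translated C declaration such as
// `pub extern fn raw_checksum(data: [*c]const u8, len: usize) u32;`.
// Hypothetical; defined in Zig only to keep the sketch self-contained.
fn raw_checksum(data: [*c]const u8, len: usize) u32 {
    var sum: u32 = 0;
    var i: usize = 0;
    while (i < len) : (i += 1) sum +%= data[i];
    return sum;
}

/// Zig-facing wrapper: takes a slice, so pointer and length cannot disagree.
pub fn checksum(bytes: []const u8) u32 {
    return raw_checksum(bytes.ptr, bytes.len);
}

test "wrapper forwards the slice as pointer and length" {
    try std.testing.expectEqual(@as(u32, 10), checksum(&[_]u8{ 1, 2, 3, 4 }));
}
```

If the raw signature later changes, for example to a different element type, the call inside `checksum` stops compiling, which is the extra safety layer described above.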
The "moved" and "owned" stuff I think comes from a Rust style. It looks like I can just delete all the "moved" variants and just have "Owned". What utility do these "moved" types offer?
Take a look at z_query_reply_err: if I change the signature from [*c]MovedBytes to *OwnedBytes, I don't think I am losing any information here, because in Zig all function parameters are constant. I might as well rename OwnedBytes to just Bytes and delete MovedBytes and z_bytes_move?
I've already gotten the translate-c source down to about 2400 lines (from 5000) by just deleting duplicated type information and the extra C stuff.
I could write some code-gen to translate the headers better, but honestly that code-gen would probably also be … 2000 lines? I think when I am finished hand editing it will be close to 1000 lines. It doesn't seem wise to embark on a code-gen journey for the first round of bindings; maybe on the second go-around.
I think I've changed my mind about code-gen.
All I'm really doing is a bunch of find-and-replace operations. There's no reason I can't make a ZON file of those and just apply them serially to an array list of the code. Then when I update the dependency I can see the diff and add more / adjust if needed.
Maybe even run ast-check between each operation etc.
Pipeline could look like:
1. translate-c build step
2. apply serial find and replace operations as described by ZON file (deletions are just replace with \n)
3. maybe run ast-check / zig fmt a few times
4. output file
5. compare diff manually using git diff each time the dependency is updated
6. profit
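The find-and-replace step of that pipeline could be sketched roughly as below. This is only an illustration under assumptions: the `Rule` type, the `rules` table and `applyRules` are hypothetical names, and the rules are hard-coded here rather than loaded from a ZON file.

```zig
const std = @import("std");

// Hypothetical rule shape; a ZON file could hold a list of these.
const Rule = struct { find: []const u8, replace: []const u8 };

// Example rules, applied in order.
const rules = [_]Rule{
    .{ .find = "[*c]const u8", .replace = "[*]const u8" },
    .{ .find = "z_owned_bytes_t", .replace = "OwnedBytes" },
};

/// Apply every rule to the whole text, one after another.
/// The caller owns the returned buffer.
fn applyRules(allocator: std.mem.Allocator, source: []const u8, rule_list: []const Rule) ![]u8 {
    var current = try allocator.dupe(u8, source);
    errdefer allocator.free(current);
    for (rule_list) |rule| {
        const next = try std.mem.replaceOwned(u8, allocator, current, rule.find, rule.replace);
        allocator.free(current);
        current = next;
    }
    return current;
}

test "rules are applied in order" {
    const input = "pub extern fn f(b: [*c]z_owned_bytes_t, p: [*c]const u8) void;";
    const out = try applyRules(std.testing.allocator, input, &rules);
    defer std.testing.allocator.free(out);
    try std.testing.expectEqualStrings("pub extern fn f(b: [*c]OwnedBytes, p: [*]const u8) void;", out);
}
```

Because each rule sees the output of the previous one, rule order matters; keeping the rules in one ordered list makes that explicit when reviewing the git diff after a dependency update.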
I hope I don't get addicted to this and start writing compilers.
Alright, I think I've got some decent bindings getting generated, and a path forward for updating the dependency as needed. Now to the nitty gritty: writing an idiomatic API.
Here is an example. The c library has this config object, that is used to initialize a networking session:
Is there a better way to approach this? Not sure how far I really want to go down the code-gen route. I don't think I want to codegen these idiomatic wrappers.
So it turns out committing generated files was a bad idea. The underlying C library has different header files depending on the architecture (the sizes of some opaque types change between 64-bit and 32-bit arches). What a pain.
Guess I need to fully generate inside the build system.
Would be interesting to know what causes the different sizes, and why those size differences are not automatically reflected on the Zig side.
For instance size_t/ssize_t/uintptr_t/intptr_t on the C side are 32- or 64-bits, but so is usize/isize on the Zig side. Same with pointers.
E.g. if you take a C struct and "translate" it to a Zig extern struct, with the types mapped via translate-c or manually via the mapping table in the Zig language documentation, the Zig-side types should be compatible with the C-side types on the same computer, even if the size is different for 32- vs 64-bit CPUs.
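As a small illustration of that point, here is a sketch with a hypothetical C struct (`struct view { const uint8_t *data; size_t len; };`) mirrored as a Zig extern struct. The assertions are written relative to `usize`, so they should hold on both 32-bit and 64-bit targets.

```zig
const std = @import("std");

// Hypothetical mirror of: struct view { const uint8_t *data; size_t len; };
const View = extern struct {
    data: [*c]const u8,
    len: usize,
};

test "pointer-sized fields follow the target" {
    // Pointers and usize have the same size on the target being compiled for.
    try std.testing.expectEqual(@as(usize, @sizeOf(usize)), @as(usize, @sizeOf([*c]const u8)));
    // So the struct is two pointer-sized words, whatever the word size is.
    try std.testing.expectEqual(@as(usize, 2 * @sizeOf(usize)), @as(usize, @sizeOf(View)));
}
```

This only covers fields declared with pointer-sized types; a field declared in the header as a fixed number of bytes keeps that size regardless of target.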
The issue with my first approach was I was using translate-c on header files supplied with the library that were for a specific architecture, and then trimming the output after that using find and replace operations.
Specifically, the dependency zenoh-c uses generated header files, for example:
So the translate-c output will be different depending on the arch, so I cannot commit the output of translate-c into my repository (it will only be valid for a single architecture).
I have changed my pipeline to just directly use the translate-c output and I am just importing it now in the zig code:
And I was calling them opaque because they were in a file called zenoh_opaque.h (the C library is making some types "opaque" by hiding their contents as just bytes).