Converting array of strings to C strings without overflowing

I have a list of structs, where each struct has a name field of type []const u8. This is basically a list of the items in a folder, so the name contains the file or directory name.

I populate the structs like this, copying the strings and returning the owned slice from an ArrayList:

    var list = ArrayList(FileSystemEntry).init(allocator);
    ...
    var iterator = i_dir.iterate();
    while (try iterator.next()) |path| {
        try list.append(.{
            .name = try allocator.dupe(u8, path.name),
            .kind = path.kind,
        });
    }

    return list.toOwnedSlice();

At this point everything is fine and the caller of this function can read the list correctly.
But then I need to pass each name to a C function that expects a const char *text.

By reading this forum I found that I can simply do this for conversion:

const c_str: [*c]const u8 = item.name.ptr;

But when I print the contents of c_str, I get what looks like several name fields glued together. It looks to me like an overflow of some sort: the pointer points at the beginning of item.name, but the C string does not end where item.name ends.

Here’s an example of the output (I loop through all the items and print a Zig string and a C string):

zig-cache >>> zig-cachebuild.zig.gitmodules.gitignore
build.zig >>> build.zig.gitmodules.gitignore
.gitmodules >>> .gitmodules.gitignore
libs >>> libs.gitsrc
zig-out >>> zig-out
.gitignore >>> .gitignore
.git >>> .gitsrc
src >>> src

I’m pretty sure this comes from my lack of understanding of memory management, so I would be very grateful if someone could point out my mistakes (or at least nudge me in the right direction 🙂).


The problem here is that C-strings are null-terminated.
Instead of using a length, many C functions just read the memory until one of the bytes is '\x00' and take that as the end of the string.
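
To make the glued-together output concrete, here is a minimal sketch of what a C-style scan does when it starts at the pointer of a plain slice. The cLen helper and the memory buffer are made up for illustration. The buffer only stands in for heap memory where two names happen to sit back to back, which is what the output in the question suggests but is not something the allocator guarantees. It should run with zig test.

```zig
const std = @import("std");

// Hypothetical helper: counts bytes up to the first 0 byte,
// which is roughly how C's strlen finds the end of a string.
fn cLen(ptr: [*]const u8) usize {
    var i: usize = 0;
    while (ptr[i] != 0) : (i += 1) {}
    return i;
}

test "a plain slice carries no terminator of its own" {
    // Stand-in for memory in which "libs" is directly followed by ".git".
    const memory = "libs.git\x00";

    // The Zig slice knows its length: 4 bytes, "libs".
    const name: []const u8 = memory[0..4];
    try std.testing.expectEqualStrings("libs", name);

    // A C-style scan ignores that length and keeps reading until the
    // first 0 byte, so it also picks up the neighbouring ".git".
    try std.testing.expectEqual(@as(usize, 8), cLen(name.ptr));
}
```

In real code the bytes after the slice belong to whatever the allocator placed there, so reading them is undefined behavior rather than a predictable "glue" of names.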

Zig provides functions for this. For example, you can use allocator.dupeZ, which returns a [:0]u8: a slice with a 0 byte stored right after its last element.
Note that you also need to store it as [:0]u8 or [:0]const u8. If you store it as a plain []u8 instead, functions like allocator.free will not know that there is an extra byte in the allocation, and freeing the memory will cause undefined behavior.
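
As a rough sketch of the above (not the code from the question), this is how a dupeZ copy could be created, checked, and freed. std.testing.allocator is used here only because it reports leaks and mismatched frees; the string "build.zig" is just sample data.

```zig
const std = @import("std");

test "dupeZ copies the bytes and appends a 0 terminator" {
    const allocator = std.testing.allocator;
    const original: []const u8 = "build.zig";

    // Keep the sentinel in the type so that free() sees the full allocation.
    const name: [:0]const u8 = try allocator.dupeZ(u8, original);
    defer allocator.free(name);

    // len does not count the terminator ...
    try std.testing.expectEqual(original.len, name.len);
    try std.testing.expectEqualStrings(original, name);
    // ... but the byte at index len is guaranteed to be 0.
    try std.testing.expectEqual(@as(u8, 0), name[name.len]);
}
```

A [:0]const u8 still coerces to []const u8, so the rest of the Zig code can keep treating the name as a normal slice.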

You can then extract the raw pointer like this:

const c_str: [*:0]const u8 = item.name.ptr;

Note that I used [*:0]const u8 here because it gives more type safety than [*c]const u8 (which can basically mean anything).
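
A small sketch of why the sentinel type helps: a function that declares [*:0]const u8 can only be called with a pointer the compiler knows is 0-terminated. textLength below is a hypothetical stand-in written in Zig for the C function taking const char *text; a real one would be declared via extern or @cImport.

```zig
const std = @import("std");

// Hypothetical stand-in for a C function with a `const char *text` parameter.
fn textLength(text: [*:0]const u8) usize {
    // std.mem.len scans up to the 0 sentinel, like strlen.
    return std.mem.len(text);
}

test "passing a sentinel-terminated name as a C string" {
    // String literals are already 0-terminated; a dupeZ result has the same type.
    const name: [:0]const u8 = "zig-out";

    const c_str: [*:0]const u8 = name.ptr;
    try std.testing.expectEqual(@as(usize, 7), textLength(c_str));

    // std.mem.span goes the other way: from a C string back to a Zig slice.
    try std.testing.expectEqualStrings("zig-out", std.mem.span(c_str));
}
```

With a plain []const u8, the line that assigns name.ptr to a [*:0]const u8 should fail to compile, which is exactly the mistake from the question being caught early.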


Thank you so much @IntegratedQuantum !

After your explanation it’s pretty obvious now. Since I’m going to use the name in a bunch of C functions anyway, I might as well store it as [:0]const u8 and use allocator.dupeZ to copy the data into it.

Now it all works. And I’ve also learned that it’s better to use specific types than [*c].
🙏
