My first zig library - looking for critics

shaozi · July 9, 2024, 8:56pm

Just finished my first library: cowsay. Learnt a lot about zig and many thanks to people (@dimdin) in this forum.

Looking for comments and critics. Did I follow the conventions and best practice? Is there something smells?

+------------+
| Thank you! |
+------------+
       \  ^__^
        \ (oo)\_______
          (__)\       )\/\
              ||----w |
              ||     ||

AndrewCodeDev · July 9, 2024, 9:07pm

I can spot one thing right away that is best to avoid:

var buffer = std.ArrayList(u8).init(std.heap.page_allocator);

Avoid using the std.heap.page_allocator directly. It’s best to take an allocator as a parameter and use something like the GeneralPurposeAllocator for what you’re trying to do here.

The page allocator is great for building other allocators, but it’s a very raw piece of machinery that you can run into problems with later on.

Also, let’s say that you wanted to log where all your allocations occur in your application. You can do that with a logging allocator quite easily if you are taking in allocators as parameters: zig/lib/std/heap/logging_allocator.zig at master · ziglang/zig · GitHub

Also, the GPA has a deinit function that returns information that you can use to check if you have leaked memory.

shaozi · July 9, 2024, 9:34pm

Nice to know! found some sample code on GeneralPurposeAllocator here. Will take a look!

andrewrk · July 9, 2024, 9:35pm

2 things stand out to me:

You can rename cowsay.zig to Cowsay.zig and put the struct fields directly in the file, because all Zig source files are structs.

It is against conventions in Zig to allocate memory inside a module without accepting an Allocator parameter. It’s also generally unnecessary to perform heap allocation inside print functionality. A more ziggy implementation would figure out how to avoid heap allocation altogether in this functionality.

shaozi · July 9, 2024, 10:08pm

Aha! I was wondering what is the difference between a bare zig file and a struct inside it. Do you have an example on use a zig file as a struct?

shaozi · July 10, 2024, 12:31am

Wow! This is so cool! All I need to do is to remove the struct brackets, and add a const Self = @This(); ! It is so clean now!

shaozi · July 11, 2024, 4:38am

@AndrewCodeDev I added an allocator as a field so user can change it. I also use GPA as the default value as the allocator field. Here: cowsay/src/Cowsay.zig at d76ffb04913641b9aaa6f0f2953bcbffdc1502da · shaozi/cowsay · GitHub

Does this matches your suggestion? I feel there is a better way, but I don’t know. People say the GPA as a global variable is questionable. I feel the same way, but just don’t know a better way.

AndrewCodeDev · July 11, 2024, 6:29am

Closer! First, good on you for sticking it out. Thinking about design, even for simple projects, will help you grow as a developer in this language.

There’s one very simple change I would recommend… in your main function, you create your Cowsay object on this line:

line 27:  var cow = Cowsay{ .w = stdout.any() };

Right there is where you should pass the allocator in. Preferably, it should look something like so:

// make your allocator in main.zig
var gpa = std.heap.GeneralPurposeAllocator(.{}){};
// pass the allocator to Cowsay
var cow = Cowsay{ .w = stdout.any(), .allocator = gpa.allocator() };

Okay, so why am I making a big deal out of this? Think about our options here:

It’s part of the module, but it’s public
It’s part of the module, and it is not public
It’s handed to the module from outside

Let’s go through our options and think about the costs.

For option 1, Think about it like a security model. Each user should have the minimum amount of privileges required. If you make it public, that means that other modules that import it could use that allocator too (why not, it’s public, after all). You have to answer the question “who owns this allocator?” and that becomes more important as you explore different allocator options.

On option 2, you are controlling the allocator from the implementation. For simple projects, this is hard to debate and is kind of pedantic. Once you get to anything more complex, it becomes a problem. What if I wanted to pass a logging allocator to the implementation to see what what allocations are actually made? I can’t because that decision was made for me.

For option 3, you get what you needed in one package - the user chooses and it’s clear who owns the allocator… in this case, main owns the allocator and it’s main’s job to call deinit to cleanup the resources. It’s clear, simple, and allows the user to have options.

Should you ever make choices about an allocator? Sure, it happens in the standard library actually in the json module. They use an arena allocator depending on the circumstance because they’ll use that to clear out the memory when they’re done. That said, it’s still wrapping an allocator that the user provides. You can see that here…

pub fn parseFromTokenSource(
    comptime T: type,
    allocator: Allocator,
    scanner_or_reader: anytype,
    options: ParseOptions,
) ParseError(@TypeOf(scanner_or_reader.*))!Parsed(T) {
    var parsed = Parsed(T){
        .arena = try allocator.create(ArenaAllocator),
        .value = undefined,
    };
    errdefer allocator.destroy(parsed.arena);
    parsed.arena.* = ArenaAllocator.init(allocator);
    errdefer parsed.arena.deinit();

    parsed.value = try parseFromTokenSourceLeaky(T, parsed.arena.allocator(), scanner_or_reader, options);

    return parsed;
}

I’ll also add that it makes it explicit that “this thing allocates”. If you don’t do that, it allocates in the background silently. Again, for simple projects, you won’t lose sleep over it. For anything more advanced, that kind of thing can really matter.

The point being, allocation is a touchy subject (to put it mildly). If you can, it’s best to let the user decide what kind of allocation strategy they want to use (aka, they can hand different allocators to your Cowsay object). That’s part of why we program in Zig - we care about things like allocation