Code review: mediocre pun generator

edwardloveall · April 12, 2025, 10:11pm

I’m learning zig (coming from ruby/js/little bit of rust) and I’m looking for feedback on my first project not driven by a tutorial. I used How to Learn a New Programming Language and step 14 says:

Ask someone who knows the language better than you do to critique your code. Don’t skip this one.

So I made girls_just_want_to_have_memory_managed_puns and here I am looking for critique. I’d love to hear about:

Idioms that I’m nailing or failing
Memory management (am I leaking memory or dangling a pointer somewhere)
Standard library tools that I missed
Anything else!

LucasSantos91 · April 12, 2025, 11:38pm

Cool program. Quite good for a first program.
The only part that would really have to change is needing to copy the txt file into the bin folder. There are two ways to address this:

Use the build system to copy the file:

// build.zig
b.installBinFile("wikipedia-idioms.txt", "wikipedia-idioms.txt");

Bake the file directly into the program using @embedFile:

diff --git a/build.zig b/build.zig
@@ -15,17 +15,18 @@ pub fn build(b: *std.Build) void {
     // set a preferred release mode, allowing the user to decide how to optimize.
     const optimize = b.standardOptimizeOption(.{});

+    const idioms = b.createModule(.{
+        .root_source_file = b.path("wikipedia-idioms.txt"),
+    });
+
     const exe_mod = b.createModule(.{
         .root_source_file = b.path("src/main.zig"),
         .target = target,
         .optimize = optimize,
+        .imports = &.{.{
+            .name = "wikipedia-idioms.txt",
+            .module = idioms,
+        }},
     });

diff --git a/src/main.zig b/src/main.zig
@@ -53,13 +53,7 @@ pub fn main() !u8 {
     const best_rhymes = try collectMaxScoreRhymes(allocator, rhymes);
     defer allocator.free(best_rhymes);

-    const idiom_size = 1024 * 1024;
-    const idioms = try fs.cwd().readFileAlloc(
-        allocator,
-        "wikipedia-idioms.txt",
-        idiom_size,
-    );
-    defer allocator.free(idioms);
+    const idioms = @embedFile("wikipedia-idioms.txt");

     const idioms_with_rhyme = try matchedIdioms(allocator, idioms, best_rhymes);
     defer allocator.free(idioms_with_rhyme);

@@ -124,7 +117,7 @@ fn collectMaxScoreRhymes(allocator: mem.Allocator, rhymes: []Rhyme) ![]Rhyme {

 fn matchedIdioms(
     allocator: mem.Allocator,
-    idioms: []u8,
+    idioms: []const u8,
     best_rhymes: []Rhyme,
 ) ![]PhraseWithRhyme {
     var idiom_lines = mem.splitScalar(u8, idioms, '\n');

Note the last diff, idioms should take [] const u8, there’s no reason to take it mutably. If you’re going to bake the file into the binary, this becomes mandatory.

LucasSantos91 · April 12, 2025, 11:59pm

The GeneralPurposeAllocator has been renamed to DebugAllocator. It’s useful when debugging, because it catches leaks, but has bad performance. In release builds, it’s better to use something different. It has been recomended to use the SmpAllocator, but it’s mostly focused on concurrency. In my project, I just use c_allocator. It depends on libc, so in your build.zig you’d have to declare this dependency. Here’s how I’ve been doing in my codebase (a bit convoluted, I’ll admit):

const builtin = @import("builtin");
const Gpa = if (builtin.mode == .Debug)
    std.heap.GeneralPurposeAllocator(.{})
else
    void;
fn initAllocator() Gpa {
    return Gpa{};
}
fn deinitAllocator(gpa: *Gpa) void {
    if (builtin.mode == .Debug)
        _ = gpa.deinit();
}
fn getAllocator(gpa: *Gpa) std.mem.Allocator {
    return if (builtin.mode == .Debug)
        gpa.allocator()
    else
        std.heap.c_allocator;
}

pub fn main() !u8 {
    var gpa = initAllocator();
    defer deinitAllocator(&gpa);
    const allocator = getAllocator(&gpa);

LucasSantos91 · April 13, 2025, 12:22am

You can save one allocation here:

const url = try fmt.allocPrint(
    allocator,
    "https://rhymebrain.com/talk?function=getRhymes&word={s}",
    .{word_arg},
);

By creating a buffer on the stack with enough size for the prefix, which is known at comptime, and some extra space for the word. You’d just have to put an upper bound on the size of the word.
In fact, if you’re willing to dig a bit into how Uri is implemented, you could probably do most of the parsing at comptime, and during runtime you’d just fill some fields with the word the user provided.

Another allocation that you can save is here:

var reader = req.reader();
return try reader.readAllAlloc(allocator, body_size);
// ... later
const parsed = try json.parseFromSlice([]Rhyme, allocator, body, .{});

You dumped all the data from the uri into a buffer, and then immediatelly parsed the buffer. It’s more efficient to take the reader from connection and pass it to json, using std.json.reader.

AndrewKraevskii · April 13, 2025, 8:48am

In low-level languages where you have control over memory, it’s desirable to figure out if you can drop the memory management aspect to make your life easier and the program faster.

Since it is a short-lived program, there is actually no need to free at all during runtime (and freeing memory would only slow the program down).
But doing so can trigger tools like Valgrind, which is not desirable in debug mode. Using std.heap.ArenaAllocator solves issues with Valgrind by freeing all memory at the end.

Another example is the collectMaxScoreRhymes function. It appends elements in a loop repeatedly; each append can potentially fail, and it requires you to defer deletion of the array.
However, it is pretty trivial to calculate the size of the array beforehand. This allows reducing potential failure points to just one.

fn collectMaxScoreRhymes(allocator: mem.Allocator, rhymes: []Rhyme) ![]Rhyme {
    var max_score: u16 = 0;
    var count: usize = 0;
    for (rhymes) |rhyme| {
        switch (std.math.order(rhyme.score, max_score)) {
            .eq => count += 1,
            .gt => {
                max_score = rhyme.score;
                count = 1;
            },
            .lt => {},
        }
    }
    var array_list = try std.ArrayListUnmanaged(Rhyme).initCapacity(allocator, count);

    for (rhymes) |rhyme| {
        if (rhyme.score == max_score) {
            array_list.appendAssumeCapacity(rhyme);
        }
    }
    std.debug.assert(array_list.capacity == array_list.items.len);

    return array_list.items;
}

edwardloveall · April 13, 2025, 7:16pm

Thank you both so much! I’ve made a bunch of changes based on your suggestions and learned whole lot. Definitely glad I asked.