I want to write a micro benchmark for a tiny utility at TigerBeetle, a unit-test of a benchmark. I think I know what I want, but I don’t see how I can achieve all that I want without hacks. Curious to hear if anyone has solved similar problems before.
- The purpose of the benchmark is ad-hoc sanity checking. E.g., if I touch the code with a refactor, I want to be able to include before/after in the commit message. Or, when upgrading the compiler, I want to sanity check that the performance didn’t regress.
- I specifically do not want to do continuous benchmarking, graphs, automated regression detection or the like. I just want one number when I ask for it.
- I do not want to touch my `build.zig`. If I have 10 microbenchmarks, I don’t want to create 10 separate Zig modules/binaries just so that I can run them.
- I want to run the benchmarks with a small size in debug mode every time I run the tests, to prevent benchmarks from bitrotting.
- Really, I just want to use `test "binary_search: benchmark" {` as my interface; it’s just perfect for that, the same way that fuzz tests re-use the same interface and just run the passed-in corpus.
- In “test” mode, I want benchmark tests to be silent.
- In “benchmark” mode, benchmarks should display results on stderr. It’s up to the benchmark to decide what results to display.
- I also want to be able to get a binary which runs a single benchmark only, such that I can plug the binary into `poop` if I am curious about some metrics beyond just time.
- But I still want the benchmark to make its own internal measurements, such that it can separate setup costs from the actual benchmarking loop costs.
I think the surface API for this thing could look like this:
```zig
const std = @import("std");
const bench = @import("../support/bench.zig");

test "benchmark: binary_search" {
    const gpa = std.testing.allocator;

    var b = bench.init();
    defer b.deinit();

    const element_count = switch (b.size()) {
        .smoke => 128,
        .default => 10_000_000,
        .explicit => |n| n,
    };

    const array = try generate_array(gpa, element_count);
    defer gpa.free(array);
    const searches = try generate_searches(gpa, array);
    defer gpa.free(searches);

    var hash: u32 = 0;
    {
        b.start();
        for (searches) |key| {
            hash += binary_search(array, key);
        }
        b.finish();
    }

    b.print("hash {}", .{hash});
    b.print("elapsed {}", .{b.elapsed});
}
```
You can then run this as `zig build test -- "benchmark: binary_search"`.
On the `build.zig` side, we spy on the test filter name and, if it includes “benchmark”, we inject options into the build saying “we are in benchmark mode”.
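For concreteness, here is a minimal sketch of that `build.zig` side, assuming a recent-ish `std.Build` API; the names `benchmark_mode`, `benchmark_size`, `bench_options`, and `src/main.zig` are illustrative, not anything TigerBeetle actually uses:

```zig
const std = @import("std");

pub fn build(b: *std.Build) void {
    const target = b.standardTargetOptions(.{});
    const optimize = b.standardOptimizeOption(.{});

    // Everything after `--` on the CLI shows up in b.args; reuse it both as the
    // test filter and as the signal that we are in benchmark mode.
    const filters: []const []const u8 = b.args orelse &.{};
    var benchmark_mode = false;
    for (filters) |filter| {
        if (std.mem.indexOf(u8, filter, "benchmark") != null) benchmark_mode = true;
    }

    const bench_options = b.addOptions();
    bench_options.addOption(bool, "benchmark_mode", benchmark_mode);
    bench_options.addOption(?usize, "benchmark_size",
        b.option(usize, "benchmark_size", "Explicit size for benchmarks"));

    const tests = b.addTest(.{
        .root_source_file = b.path("src/main.zig"),
        .target = target,
        .optimize = optimize,
        .filters = filters,
    });
    tests.root_module.addOptions("bench_options", bench_options);

    const test_step = b.step("test", "Run tests (and benchmarks, when filtered)");
    test_step.dependOn(&b.addRunArtifact(tests).step);
}
```

With this shape, a plain `zig build test` leaves `b.args` unset, so the filter is empty, `benchmark_mode` stays false, and every benchmark runs silently at smoke size.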
On the `bench.zig` side, we look at those options to see whether we are in unit-test or benchmark mode. If we are benchmarking, then `bench.init` flips `std.testing.log_level` to `.info` (and `deinit` flips it back). `bench.print` is `log.info` in disguise. And `b.size()` is `.smoke` in test mode, `.default` in benchmark mode, and the explicit value if something like `-Dbenchmark_size=1_000_000` is passed on the CLI.
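And a corresponding sketch of `bench.zig`, under the same assumptions (`bench_options` is the hypothetical options module injected from the `build.zig` sketch above):

```zig
const std = @import("std");
const options = @import("bench_options"); // injected from build.zig (hypothetical name)

pub const Size = union(enum) { smoke, default, explicit: usize };

pub const Bench = struct {
    log_level_previous: std.log.Level,
    timer: std.time.Timer = undefined,
    elapsed: u64 = 0,

    pub fn size(_: *const Bench) Size {
        if (!options.benchmark_mode) return .smoke;
        if (options.benchmark_size) |n| return .{ .explicit = n };
        return .default;
    }

    // Only the code between start() and finish() counts towards `elapsed`,
    // so setup cost stays out of the measurement.
    pub fn start(b: *Bench) void {
        b.timer = std.time.Timer.start() catch @panic("timer unsupported");
    }

    pub fn finish(b: *Bench) void {
        b.elapsed = b.timer.read();
    }

    // log.info in disguise: silent in test mode, visible in benchmark mode.
    pub fn print(_: *const Bench, comptime fmt: []const u8, args: anytype) void {
        std.log.info(fmt, args);
    }

    pub fn deinit(b: *Bench) void {
        std.testing.log_level = b.log_level_previous;
    }
};

pub fn init() Bench {
    const previous = std.testing.log_level;
    // The default test runner only prints log messages at or above
    // std.testing.log_level, so flipping it to .info makes bench.print visible.
    if (options.benchmark_mode) std.testing.log_level = .info;
    return .{ .log_level_previous = previous };
}
```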
Is there a better way to do what I want here?