Hello everyone! Glad to be part of this community.
I’m working on a Zig library focused on parsing large CSV files. While I could keep adding features and expanding the API, I’ve reached a stage where I really need to get serious about tracking performance and memory usage, especially avoiding unnecessary allocations.
Has anyone here profiled their Zig code extensively, or know of any repositories that have good examples of profiling and benchmarking setups? I’d really appreciate concrete examples, best practices, or even just tips on how you approached this in your own projects.
Thanks in advance for any pointers or links!
Extra relevant info: this is the library I am building → repo, and I use macOS.
Could try valgrind --tool=massif and massif-visualizer (unsure what the status of macOS support is with valgrind, though).
EDIT: Might not be relevant for your use case, but another thing I’ll mention is using a custom Allocator to enforce certain properties of your program’s heap usage (e.g. refusing any allocation over a certain size; the example in the linked post will need updating for newer Zig versions).
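To illustrate the idea, here is a minimal sketch of such a wrapper allocator that refuses any single allocation larger than a cap. CapAllocator and max_bytes are hypothetical names, and the vtable signatures target Zig 0.14.x (older versions differ, as the post notes):

```zig
const std = @import("std");

/// Hypothetical wrapper that fails any allocation larger than `max_bytes`,
/// forwarding everything else to a child allocator.
const CapAllocator = struct {
    child: std.mem.Allocator,
    max_bytes: usize,

    pub fn allocator(self: *CapAllocator) std.mem.Allocator {
        return .{ .ptr = self, .vtable = &vtable };
    }

    const vtable: std.mem.Allocator.VTable = .{
        .alloc = alloc,
        .resize = resize,
        .remap = remap,
        .free = free,
    };

    fn alloc(ctx: *anyopaque, len: usize, alignment: std.mem.Alignment, ret_addr: usize) ?[*]u8 {
        const self: *CapAllocator = @ptrCast(@alignCast(ctx));
        if (len > self.max_bytes) return null; // enforce the cap
        return self.child.rawAlloc(len, alignment, ret_addr);
    }

    fn resize(ctx: *anyopaque, memory: []u8, alignment: std.mem.Alignment, new_len: usize, ret_addr: usize) bool {
        const self: *CapAllocator = @ptrCast(@alignCast(ctx));
        if (new_len > self.max_bytes) return false;
        return self.child.rawResize(memory, alignment, new_len, ret_addr);
    }

    fn remap(ctx: *anyopaque, memory: []u8, alignment: std.mem.Alignment, new_len: usize, ret_addr: usize) ?[*]u8 {
        const self: *CapAllocator = @ptrCast(@alignCast(ctx));
        if (new_len > self.max_bytes) return null;
        return self.child.rawRemap(memory, alignment, new_len, ret_addr);
    }

    fn free(ctx: *anyopaque, memory: []u8, alignment: std.mem.Alignment, ret_addr: usize) void {
        const self: *CapAllocator = @ptrCast(@alignCast(ctx));
        self.child.rawFree(memory, alignment, ret_addr);
    }
};
```

Wrapping a parser's allocator like this during tests turns "we never allocate more than N bytes at once" from a hope into an enforced invariant.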
You rock mate, thanks for taking the time to answer.
For profiling applications, I also had great success with koute/bytehound on GitHub (a memory profiler for Linux).
For a Zig library though, my first instinct would be not to measure it, but rather to think from first principles about what the simplest and most efficient memory allocation strategy would be, and then make allocation in the library so simple that it doesn’t need tracking.
For example, if my goal is parsing large CSV files, it probably means I am going to parse them row-by-row, in a streaming fashion. But that means that the only memory I need is a single buffer to hold a row. Then I can make an API that takes this buffer as a parameter, and leave allocation to the caller.
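The caller-owns-the-buffer idea above can be sketched in a few lines. nextRow is a hypothetical name, and this uses the pre-0.15 std reader API (readUntilDelimiterOrEof):

```zig
const std = @import("std");

/// Hypothetical row-at-a-time API: the caller supplies the buffer, so the
/// library itself never allocates. Returns the next row without its
/// trailing newline, or null at end of input.
fn nextRow(reader: anytype, buf: []u8) !?[]u8 {
    return reader.readUntilDelimiterOrEof(buf, '\n');
}
```

With this shape, the caller decides whether the row buffer lives on the stack, in an arena, or is reused across calls, and the library has nothing left to profile allocation-wise.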
Thanks for taking the time mate, people like you really make participating in this forum a great experience.
Liked your suggestion. I do have two kinds of parsing though: the most straightforward one (the one you mentioned) and a “faster-ish and more efficient” one, at least on paper, using a state machine. But I would like to provide metrics on that, you know? To actually know whether it is faster at all.
I stopped adding new stuff to the lib API completely and will focus 100% on perf, so for sure I will revisit my code and follow your suggestion.
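For comparing the two parsing strategies, a minimal sketch with std.time.Timer from the standard library is often enough before reaching for a full profiler. The benchmark helper and the parser passed to it are hypothetical stand-ins, not the library's actual API:

```zig
const std = @import("std");

/// Hypothetical micro-benchmark helper: runs `f` on `input` `iterations`
/// times and returns the average nanoseconds per call.
fn benchmark(comptime f: anytype, input: []const u8, iterations: usize) !u64 {
    var timer = try std.time.Timer.start();
    for (0..iterations) |_| {
        // Keep the optimizer from eliding the call entirely.
        std.mem.doNotOptimizeAway(f(input));
    }
    return timer.read() / iterations;
}
```

Running both parsers through the same helper on the same inputs gives a first answer to "is the state machine actually faster?", which can then be refined with a sampling profiler.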
You rock mate! Thanks.
I use Tracy for this. It provides sampling and zone-based profiling so you can check whether your code is getting faster, and it supports tracking allocations so you can see where allocations happen, how long they take, and whether that correlates with performance.
My Tracy bindings provide an allocator that you can wrap your Zig allocators in. In a normal build it returns them unchanged, but in a -Dtracy build it returns an allocator wrapper that notifies Tracy whenever you alloc/free/etc:
const gpa = tracy.Allocator.allocator(std.heap.smp_allocator);
(It won’t do things like tell you about cache misses or such though, at least AFAIK, you’ll need other tools for that type of analysis.)
Thanks, that seems like a good option.