Zig is the first language I’ve used where performance optimization involves more than running microbenchmarks or profiling an interpreted language. I’m getting familiar with different aspects of optimization, such as cache coherency, branching, and SIMD.
However, I’m not familiar with any of the tools for actually measuring these things. How can I measure cache hits versus cache misses, or how often I’m reading from L1, L2, L3, or main memory? I’m currently developing on an M1 Mac and I’m looking for any good resources that are specific to benchmarking on this machine. Most of my searches point to tools like *trace on Linux. I’m also not sure which tools are specific to C/C++ and which are relevant for any compiled machine code.
My project for testing this out is implementing a DEFLATE compressor/decompressor in Zig, and there are quite a few hot loops that I’d like more insight into.