Which parts could be easily improved in a 200+ odd line Zig code?

Here is the 1st cut of WIP Zig Code.

I’d appreciate pointers from people with more experience (from the perspective of data handling and I/O), primarily on:

  • What are the obvious things to fix for optimal memory management & I/O?
  • Whether some implementation is a roundabout way of doing things that has a more idiomatic Zig way.
  • Whether any particular choice (struct design, allocator choice, etc.) is suboptimal, and what would have been better.

Postscript

I’m a Zig novice. I have been going through articles & videos to understand it a little.

I came across an article with sample code for an LSM in C. Not sure how accurate… but I just tried writing its given implementation in Zig… currently I have a dirty port with a few additions.

const CWD = std.fs.cwd();

Calling cwd() at comptime is questionable. It’ll work on Linux but probably not on Windows. I’d call it at runtime in main instead.
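
A minimal sketch of what that could look like, assuming a std version where std.fs.cwd() and Dir.createFile are available as in the thread's code. The helper writeMarker and the file name lsm_marker.tmp are made up for illustration; the point is only that the directory handle is obtained inside main and passed down as a parameter.

```zig
const std = @import("std");

// Hypothetical helper: takes the directory as a parameter instead of
// reading a container-level constant.
fn writeMarker(dir: std.fs.Dir, name: []const u8) !void {
    const file = try dir.createFile(name, .{});
    defer file.close();
    try file.writeAll("ok\n");
}

pub fn main() !void {
    // cwd() is evaluated at runtime here, not in a global initializer.
    const cwd = std.fs.cwd();
    try writeMarker(cwd, "lsm_marker.tmp");
    // Clean up the illustrative file again.
    defer cwd.deleteFile("lsm_marker.tmp") catch {};
}
```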

By convention, global variables / constant identifiers are written in snake_case, not UPPER_SNAKE_CASE as you’re using.

var GPA_M = std.heap.GeneralPurposeAllocator(.{}){};
const MALLOC = GPA_M.allocator();
var GPA_D = std.heap.GeneralPurposeAllocator(.{}){};
const ALLOCATOR = GPA_D.allocator();

Using 2 of the same allocator is strange to me. It seems like one is supposed to be for memory allocations and the other is supposed to represent disk allocations, but obviously they’re both allocating memory, not disk space. Also the allocator would usually be created inside main, and passed into functions which allocate.
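
Roughly, creating the allocator in main and passing it in could look like the sketch below. It uses the same std.heap.GeneralPurposeAllocator as your code (names can differ between Zig releases), and makeBuffer is a hypothetical function, not something from your file.

```zig
const std = @import("std");

// Hypothetical function: the caller picks the allocator and owns the result.
fn makeBuffer(allocator: std.mem.Allocator, len: usize) ![]u8 {
    const buf = try allocator.alloc(u8, len);
    @memset(buf, 0);
    return buf;
}

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = gpa.deinit();
    const allocator = gpa.allocator();

    const buf = try makeBuffer(allocator, 16);
    defer allocator.free(buf);
}
```

Because makeBuffer only sees a std.mem.Allocator, swapping in a different allocator later means changing main and nothing else.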

Those are just some obvious things.


Funnily enough, trying to cross-compile it to Windows segfaults the compiler (with musl).


Thanks! I’ll fix the CWD bit and case for identifiers.

I should have added comments for 2 allocators.

I’ve not yet grasped the differences between all the available allocators… so I was planning on having different allocators in the same run for mem & predisk… and check which suits this better.

Also I was planning to find a way to bring in LSMTree.mem onto stack & leave LSMTree.predisk on heap… didn’t yet get around to playing with it.
Just wanted to see… if I increase LSMTree.predisk size by a lot; if it still performs well.
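
A minimal sketch of that idea, with assumptions stated up front: Tree is a hypothetical stand-in for LSMTree, the element type and sizes are arbitrary, and std.heap.FixedBufferAllocator over a stack array is just one possible way to keep mem off the heap.

```zig
const std = @import("std");

// Hypothetical stand-in for LSMTree: each buffer comes from its own allocator.
const Tree = struct {
    mem: []u64,
    predisk: []u64,
};

pub fn main() !void {
    // Stack-backed storage for the small in-memory buffer.
    var stack_buf: [4096]u8 = undefined;
    var fba = std.heap.FixedBufferAllocator.init(&stack_buf);
    const mem_alloc = fba.allocator();

    // Heap-backed allocator for the larger predisk buffer.
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = gpa.deinit();
    const predisk_alloc = gpa.allocator();

    const tree = Tree{
        .mem = try mem_alloc.alloc(u64, 64),
        .predisk = try predisk_alloc.alloc(u64, 1024),
    };
    defer mem_alloc.free(tree.mem);
    defer predisk_alloc.free(tree.predisk);
}
```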

Thanks! I remember reading the FS bit’s comptime woe on Zig docs… but forgot. Would move it.

Generally you want to put defer _ = gpa.deinit(); at the top so it executes last (defers are executed in the opposite order to how they were declared).
In your case the deinit calls are at the end and won’t run if any try returns an error. So you won’t be checking for memory leaks :frowning:
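
Here is a small sketch of the ordering, independent of allocators. Trace and work are made-up names; the defers just record a letter so the order is visible.

```zig
const std = @import("std");

// Tiny fixed-size log so the example needs no allocation.
const Trace = struct {
    buf: [4]u8 = undefined,
    len: usize = 0,

    fn push(self: *Trace, c: u8) void {
        self.buf[self.len] = c;
        self.len += 1;
    }
};

fn work(trace: *Trace, fail: bool) !void {
    defer trace.push('A'); // registered first, runs last
    defer trace.push('B'); // registered second, runs first
    if (fail) return error.Oops; // both defers above still run
    trace.push('C');
}

pub fn main() void {
    var ok = Trace{};
    work(&ok, false) catch {};
    var failed = Trace{};
    work(&failed, true) catch {};
    // prints "CBA BA"
    std.debug.print("{s} {s}\n", .{ ok.buf[0..ok.len], failed.buf[0..failed.len] });
}
```

A defer placed after the failing line would never have been registered, which is why the gpa.deinit() defer belongs right after the allocator is created.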

Oh yes, thanks for catching that.
I moved a lot of code around while porting and trying to get rid of bugs… and it got left over.

Also, is there a popular Zig way to manage allocate/deinit calls when they are not getting called in the same block…

Can you give a more specific example? Any Zig struct with init()/deinit() functions is an example of this, since init() and deinit() call other allocation and deinit functions internally.
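
As a sketch of that pattern (Buffer is a hypothetical type, not from std or from your code): init allocates and remembers the allocator, deinit frees with the same one, and the caller pairs them with defer.

```zig
const std = @import("std");

// Hypothetical type: stores the allocator it was created with.
const Buffer = struct {
    allocator: std.mem.Allocator,
    items: []u32,

    pub fn init(allocator: std.mem.Allocator, capacity: usize) !Buffer {
        return Buffer{
            .allocator = allocator,
            .items = try allocator.alloc(u32, capacity),
        };
    }

    pub fn deinit(self: *Buffer) void {
        // Free with the same allocator that init used.
        self.allocator.free(self.items);
        self.* = undefined;
    }
};

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = gpa.deinit();

    var buf = try Buffer.init(gpa.allocator(), 8);
    defer buf.deinit(); // runs before gpa.deinit()
}
```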

For example… in this code itself, I’ve created globals for GPA & Allocator. Then calling allocate/free at different functions and doing deinit back at end of main.

Does it seem alright to keep even the Allocator a global, or should I init it in main & pass it around as a parameter, or do something else altogether?
How would you have structured it, might be a more apt query.

Initing in main and passing it around is what’s usually done, but a global is also OK. Just put defer _ = gpa.deinit(); at the top of main.