Not usable yet, but it can technically store and retrieve data, even across file failures. Safety is not there yet, simply because of the lack of testing – don’t use it, it’s not stable, it’s not safe. It uses its own cache instead of mmap, plus a classic write-ahead log. There is no allocation after initialization. It also uses its own I/O abstraction until std.Io stabilizes.
(Undocumented) Code is here: xash/kvig: key-value store in zig - Codeberg.org
I started this project not only because I wanted to learn more about DBs, but also to try out the new I/O approach. It’s a nice fit: a (small) KV store should be easy to embed in any project with any I/O usage, yet it has some interesting needs if it wants to be as efficient as possible. Because the store is not the main program itself – it can’t just throw io_uring at every problem – it has to rely on a good I/O abstraction. So, a few observations from building an I/O abstraction that is very close to std.Io, but not quite:
- `O_DIRECT` is needed as a hint for skipping the kernel cache; otherwise performance tanks a lot. If the system does not support it, it can be skipped safely.
- `fsync` is needed as a guarantee for successful writes, with `fdatasync` as a further, noticeable optimization: for ACID we don’t need all metadata updates (only when the file size changes).
- `sync` should be separated from `write` calls; we only want the speed penalty of `sync` when it is truly needed.
- multiple writes: the current std API seems to support writes from multiple buffers into a consecutive block in the file. For WAL merging, the target ranges are all over the file, so multiple writes are necessary, with batching being a performance must. Could pass `offsets: []usize` as a write parameter …
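To make the batching point concrete, here is a sketch of what a blocking fallback for such a scattered write could look like (the name `writeScattered` and its signature are my own invention, not std.Io); an io_uring backend could instead queue all of these writes in a single submission:

```zig
const std = @import("std");

/// Blocking fallback for a hypothetical batched scattered write:
/// every buffer lands at its own file offset. An io_uring backend
/// could queue all of these writes in one submission and only
/// yield the fiber once, which is the whole point of batching.
pub fn writeScattered(
    file: std.fs.File,
    buffers: []const []const u8,
    offsets: []const u64,
) !void {
    std.debug.assert(buffers.len == offsets.len);
    for (buffers, offsets) |buf, off| {
        var written: usize = 0;
        while (written < buf.len) {
            written += try file.pwrite(buf[written..], off + @as(u64, written));
        }
    }
}
```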
… or, a slightly more radical proposal: have all I/O operations return a Future. This allows batching in the backends, e.g. uring can submit multiple writes and only yield fibers on await. I did this by changing the I/O VTable functions to something like `fileSync: *const fn (*anyopaque, file: Io.File) Future(Io.SyncError!void)`. Blocking implementations can just fill in the result of the future directly. Combined with some helper functions (or a distinction between `File.open` and `Io.openFile`), it can look like this:
```zig
// how I used it
const future = file.sync(io);
const result = try future.await(io);

// as a one-liner:
const result = try file.sync(io).once(io);

// or an idea to merge the `once` call into the file namespace, e.g.:
const result = try file.sync(io); // just a wrapper for the one-liner above

// vs the namespace-split version:
const future = io.fileSync();
```
There is a non-detectable footgun lurking, though: you can, but shouldn’t, `.once` a future multiple times – `.once` cannot mutate the future to mark it as consumed, because the future is `*const` from being taken inline (this would be no problem with the namespace-split idea, though). If there has already been a discussion about why this approach wasn’t chosen, I’d be happy for some links. I doubt the current approach with `io.async(writePage, …)` can be as fast, but I’ll try it out once std.Io.IoUring works again. Changing the I/O calls in KVig is quick.
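For reference, a minimal sketch of what such a future could look like in a blocking backend (the type and the `take` method are my own naming, not the std.Io API); it also shows why the consume step needs mutability, which is exactly what the `*const` inline temporary can’t provide:

```zig
const std = @import("std");

/// Hypothetical one-shot future. A blocking backend fills `result`
/// before it even returns the future, so "awaiting" never suspends;
/// an async backend would resolve it from a completion instead.
pub fn Future(comptime T: type) type {
    return struct {
        result: ?T = null,

        /// Consume the result. Needs a mutable pointer so the future
        /// can mark itself as used -- a `*const` inline temporary
        /// cannot do this, hence the `.once` footgun.
        pub fn take(self: *@This()) T {
            defer self.result = null; // a second `take` would hit null
            return self.result.?;
        }
    };
}
```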
All in all, if these three things are implemented (and I’d be happy to help out if accepted), the performance hits the levels of LMDB – which is to say, it’s very fast – while still being abstract enough for an I/O testing implementation. I can’t wait to implement one – another perfect fit for a DB, to emulate all possible errors and thread races – but for that I’d like the I/O API to be decided on.
Cheers!