Hey everyone! I’ve been slowly hacking away at a Torch-like library for Zig.
This is an ongoing passion project of mine that I’m excited to share. It’s a combination of CUDA, C, C++, and of course, Zig.
Metaphor is intended to be a Torch-like library for Zig. The goal is a simple syntax that feels Pythonic without sacrificing low-level control.
Mixing Zig with CUDA
Metaphor is entirely GPU-driven and focused on working with large data.
After a lot of tinkering, I believe I have found a balance between exposing implementation details and keeping the includes manageable and kernels easy to write.
The library is inherently multi-stream. Similar to multi-threading, streams act like work queues that can be loaded and launched asynchronously.
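If you haven’t used CUDA streams before, here’s the idea in plain CUDA rather than Metaphor’s API (an illustrative sketch; the kernel and variable names are made up for the example): each stream is its own queue, launches return to the host immediately, and work on different streams may overlap on the device.

#include <cuda_runtime.h>

__global__ void scale(float *data, float factor, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

int main(void) {
    const int n = 1 << 20;
    float *a, *b;
    cudaMalloc(&a, n * sizeof(float));
    cudaMalloc(&b, n * sizeof(float));

    cudaStream_t s1, s2;
    cudaStreamCreate(&s1);
    cudaStreamCreate(&s2);

    // Both launches are queued asynchronously; the host thread moves on
    // immediately, and the two kernels may run concurrently on the device.
    scale<<<(n + 255) / 256, 256, 0, s1>>>(a, 2.0f, n);
    scale<<<(n + 255) / 256, 256, 0, s2>>>(b, 0.5f, n);

    cudaStreamSynchronize(s1);
    cudaStreamSynchronize(s2);

    cudaStreamDestroy(s1);
    cudaStreamDestroy(s2);
    cudaFree(a);
    cudaFree(b);
    return 0;
}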
Example:
Everything in Metaphor works with streams, so to get started, we initialize our GPU context, obtain a stream, build a graph, and allocate some tensors.
const mp = @import("metaphor");
// Initialize device and cuda context on device zero
mp.device.init(0);
const stream = mp.stream.init();
defer mp.stream.deinit(stream);
const G = mp.Graph.init(.{
.stream = stream,
.mode = .eval,
});
defer G.deinit();
// CUDA tensors have datatypes analogous to their CPU counterparts,
// with some implementation differences for 16-bit floats.
// To reduce bus traffic, freed memory is cached for reuse.
// Tensors can be freed individually, but are also freed on G.deinit().
const X1 = G.tensor(.inp, .r32, mp.Dims(2){ 2, 2 });
const X2 = G.tensor(.wgt, .r32, mp.Dims(2){ 2, 2 });
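As an aside on that caching comment: CUDA’s own stream-ordered allocator (CUDA 11.2+) does something similar, so it makes a nice illustration of the idea even though I’m not showing Metaphor’s internals here. cudaFreeAsync returns a block to a pool instead of the OS, and a later cudaMallocAsync on the stream can reuse it:

#include <cuda_runtime.h>

void pool_demo(cudaStream_t stream) {
    float *t1 = NULL;
    float *t2 = NULL;

    cudaMallocAsync((void **)&t1, 1024 * sizeof(float), stream);
    // ... launch kernels on `stream` that use t1 ...
    cudaFreeAsync(t1, stream); // returned to the pool, not the OS

    // Likely serviced from the cached block t1 just released,
    // with no round trip to the driver.
    cudaMallocAsync((void **)&t2, 1024 * sizeof(float), stream);
    cudaFreeAsync(t2, stream);
}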
The math operations are straightforward, as is the reversal process:
// y1 = A.x
const y1 = mp.ops.innerProduct(A, x, "ij,j->i");
// y2 = A.x + b
const y2 = mp.ops.linear(A, x, b, "ij,j->i");
// B = A transposed
const B = mp.ops.permutate(A, "ij->ji");
// w = u + v
const w = mp.ops.add(u, v);
// operations can be composed: e = (a + b) * (c + d)
const e = mp.ops.hadamard(mp.ops.add(a, b), mp.ops.add(c, d));
// feed-forward block
const y = mp.ops.selu(mp.ops.linear(x, A, b, "i,ij->j"));
y.reverse();
// inspect gradients
if (A.grads()) |grd| {
// use gradient...
}
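The string arguments are einsum-style index notation, so "ij,j->i" reads as y[i] = sum over j of A[i][j] * x[j]. For a rough picture of what that means on the GPU, here’s a deliberately naive plain-CUDA kernel for it (an illustration only, not Metaphor’s actual implementation, which would tile and use shared memory):

__global__ void inner_product_ij_j_i(const float *A, const float *x,
                                     float *y, int rows, int cols) {
    // One thread per output row: y[i] = sum_j A[i][j] * x[j]
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < rows) {
        float acc = 0.0f;
        for (int j = 0; j < cols; ++j)
            acc += A[i * cols + j] * x[j];
        y[i] = acc;
    }
}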
In the works:
More kernels!
I spent a lot of time working my way through Zig to find an architecture that I felt was a good starting place. At this point, I’m focusing on implementing custom kernels (there’s a sketch of one after this list).
Static Library Linkage
– Edited: this was accomplished
Configurable build
– Edited: this has been started
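To give a flavor of the kernel work, here’s a minimal elementwise SELU forward pass in plain CUDA (a sketch, not the library’s actual kernel; the constants are the standard SELU parameters from Klambauer et al.):

__global__ void selu_forward(const float *x, float *y, int n) {
    const float lambda = 1.0507009873554805f;
    const float alpha = 1.6732632423543772f;
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        // selu(x) = lambda * x                      for x > 0
        //         = lambda * alpha * (e^x - 1)      otherwise
        y[i] = x[i] > 0.0f ? lambda * x[i]
                           : lambda * alpha * (expf(x[i]) - 1.0f);
    }
}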
Anyhow, I’ve genuinely learned a lot so far and I’m looking forward to learning more!