A native non-transformer neural net in Zig

Hi, I am experimenting with running neural emulation on a non-transformer network architecture written in Zig.

If you want to build from source, use the Zig binary at master commit 9b177a7d2125; the commit subject at the master point I selected was "Merge pull request 'Rework StackFallbackAllocator' (#31841) into master".

A known limitation is that the model refuses to answer certain questions, SBAN. Disclaimer: THIS IS NOT NEW. All of the architecture I used in the Zig code is based on existing, already-published research; nothing here is an invention, and this post does not claim anything completely new or revolutionary.

I am just trying to optimise the model architecture so that a 100 MB prediction test doesn't take the 3 hours it currently does.

Please don't give a vague answer; if possible, show concrete code changes, or at least point out which files you think are currently holding back performance.

Thank you for taking the time to read this post. :blush: