Mixing instruction sets on an embedded ARM platform (GBA)

Recently I’ve been using and contributing to ZigGBA, a set of tooling for making Game Boy Advance homebrew with Zig. This is my first time using Zig, so while I’m picking up a lot, I’m still a novice here.

The GBA is an ARM7TDMI compile target. Writing optimized games for it means compiling most code to 16-bit Thumb instructions (matching the 16-bit ROM bus), while compiling other code, particularly hot loops such as memcpy and data decompression, to 32-bit ARM instructions that run faster from the system’s much smaller IWRAM (which has a 32-bit bus). (See GBA Hardware - Tonc - GBA Programming in rot13)

However, one thing ZigGBA is currently not set up to do, and which I’ve been challenged to work out on my own, is mixing these two instruction sets. In fact, ZigGBA is currently stuck on a compiler version that predates the 0.14.0 stable release. The ROM entry point must be ARM instructions (in ROM, not IWRAM), while it’s desirable for almost everything else to compile to Thumb. With 2024.10.0-mach, the .arm inline assembly directive works as intended and inserts an ARM entry point into an otherwise Thumb-targeting module, but as of 0.14.0 that directive is silently ignored. (See The `_start` function in `gba.zig` no longer compiles correctly, as of sometime between `2024.10.0-mach` and `0.14.0` compiler releases. · Issue #27 · wendigojaeger/ZigGBA · GitHub)

I can see some related discussion in a Zig GitHub issue (#22285), which makes it seem like it might currently be possible to combine ARM and Thumb code using per-module targets (PR #18160). So far, though, I have not figured out how to get this to work with ZigGBA. (Sorry for not giving actual links, but it seems there is a strict 2-link limit for new users.)
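To make the question concrete, here is a rough build.zig sketch of what I understand the per-module approach to look like, written against the 0.14-style build API. The file names (src/main.zig, src/hot.zig), the import name "hot", and the artifact name are hypothetical and not taken from ZigGBA, and it leaves out the linker script and section placement a real GBA ROM needs. I have not confirmed that the compiler accepts this combination or that the result interworks correctly between ARM and Thumb; that is exactly the part I’m unsure about.

```zig
// build.zig (sketch only; paths and names below are hypothetical)
const std = @import("std");

pub fn build(b: *std.Build) void {
    const optimize = b.standardOptimizeOption(.{});

    // Same CPU model for both targets; only the instruction set differs.
    const thumb_target = b.resolveTargetQuery(.{
        .cpu_arch = .thumb,
        .cpu_model = .{ .explicit = &std.Target.arm.cpu.arm7tdmi },
        .os_tag = .freestanding,
        .abi = .eabi,
    });
    const arm_target = b.resolveTargetQuery(.{
        .cpu_arch = .arm,
        .cpu_model = .{ .explicit = &std.Target.arm.cpu.arm7tdmi },
        .os_tag = .freestanding,
        .abi = .eabi,
    });

    // Hot code intended for IWRAM, requested as 32-bit ARM.
    const hot_mod = b.createModule(.{
        .root_source_file = b.path("src/hot.zig"),
        .target = arm_target,
        .optimize = optimize,
    });

    // Everything else, requested as 16-bit Thumb.
    const main_mod = b.createModule(.{
        .root_source_file = b.path("src/main.zig"),
        .target = thumb_target,
        .optimize = optimize,
    });
    // Assumption: a module with a different target can be imported like this.
    main_mod.addImport("hot", hot_mod);

    const exe = b.addExecutable(.{
        .name = "game",
        .root_module = main_mod,
    });
    b.installArtifact(exe);
}
```

If this is the wrong shape entirely (for example, if the ARM code has to be built as a separate object and linked in), that would be useful to know too.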

Can someone knowledgeable about using Zig with ARM targets help get me pointed in the right direction on this?