Recently, I have been thinking about operating systems that aim to help people learn about kernel development. Two examples are PintOS from Stanford and xv6 from MIT. Both are written in C. Feel free to comment with others.
I’m brainstorming the development of an OS in the style of PintOS, which is the one I am familiar with, having built its Projects 1-4. While both PintOS and xv6 are solid, I feel PintOS is dated and xv6 implements too much for learners. With modern architectures, build systems, C language standards, tooling, and OS research, I think it’s possible to build an OS offering a first-class and rigorous developer experience.
I was preparing to start this project with C, but I’m really liking Zig and would like to know if I should change course (aka continue procrastinating).
Here are the reasons I think Zig might be a good choice instead.
The build system is excellent. I would hopefully be able to avoid complicated Makefiles or CMake while managing compilation targets in a sane way.
I believe the compatibility with C would allow me to develop most components in Zig. Learners could then choose to implement their required components in C or Zig, depending on their preference. Because C is the common language of systems programming, I wouldn’t want learning how to write Zig to be a blocker for learning OS development. For any Zig interfaces that might touch C code, I would need to carefully design them for use with either Zig or C; my current understanding is that this is possible, right?
The Zig build system can generate documentation from source code. This could be a huge time saver. Because this would be an educational OS, the source would likely be more thoroughly documented than most code bases. The “textbook” would be baked into file-level and doc comments exactly once with no need for supplemental repetitive docs. Hopefully, this would make the learning process easier for users and they could complete the “textbook” by completing and commenting their implementations (assuming they use Zig).
Zig has good potential, from my perspective. Having an OS that purposely allows mixing of C and Zig could help people consider how to care for existing C code bases or how to move code bases to Zig.
Here are my major concerns and questions.
Tooling. This is the big one. I have come to greatly appreciate the static analysis and runtime safety tools available for C. I use clang tidy, gcc’s -fanalyzer, and sanitizers all the time. If the project were in C, I would make every effort to have these, or similar tools, built in from the start to improve the developer experience. How do you think Zig compares on this point?
Debugging. In some simple experiments, I have no trouble stepping through code and examining variables in both Zig and C functions in GDB. However, I have not tested this extensively and don’t know if there are further considerations for debugging a mixed language code base.
I would appreciate recommendations or thoughts on this topic and how Zig might fit the needs of the project. I have seen some OS projects in Zig that aim for a 100% Zig implementation or aim for conversion from existing C to Zig, but none that intentionally leave gaps for either language to fill in the OS functionality. Feel free to point out any of my misunderstandings or pros and cons I completely miss.
Regarding tooling, Zig can work with existing C tooling, e.g. valgrind. Tooling native to Zig is also being developed, both as part of the compiler and by third parties, at various but probably not complete stages.
Even if you don’t use the language at all, Zig provides a wrapper over clang that enables various warnings and sanitisers by default and provides great cross compilation.
Debugging shouldn’t be a problem. Zig provides pretty printers for GDB and LLDB. There is also a fork of LLDB, which is required if you want to debug when using Zig’s custom backends (only x86_64 atm, but arm is almost there).
Good points. Perhaps using Zig as the build system early on, writing in C, then transitioning to Zig as the tool-chain develops could be good. That said, none of PintOS, xv6, or even the Linux kernel are set up with extensive static and dynamic analysis built in, so it’s not a huge loss. Still, I think I should try my best to improve this side of things for the users when developing the project.
Yes, many projects that export a C lib (ghostty, bytebox, etc.) have a file where they set up the exports specifically for the C api. These will use the Zig api but provide the types and wrappers for using the Zig code.
Note: You will have to write the header files for these by hand. In the future, the compiler will be able to emit these, but that is currently broken.
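To make the hand-written header idea a bit more concrete, here is a minimal sketch of what such a C-facing interface could look like. Everything in it (`ring_t`, `ring_init`, `ring_push`, `ring_pop`, `RING_CAPACITY`) is a hypothetical name made up for illustration and is not taken from any of the projects mentioned. The top part is what would go in the header. The bottom part is a plain C implementation so the sketch compiles on its own; in the mixed setup discussed here, a learner could instead provide the same functions from Zig with `export fn`, using an `extern struct` for the matching type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ---- Would live in a hand-written header (hypothetical ring.h) ---- */

#define RING_CAPACITY 8u

/* Only fixed-width C types, so the layout is easy to mirror from Zig. */
typedef struct {
    uint8_t  data[RING_CAPACITY];
    uint32_t head;   /* index of the next byte to read */
    uint32_t count;  /* number of bytes currently stored */
} ring_t;

void ring_init(ring_t *r);
int  ring_push(ring_t *r, uint8_t byte);  /* 0 on success, -1 if full  */
int  ring_pop(ring_t *r, uint8_t *out);   /* 0 on success, -1 if empty */

/* ---- One possible implementation, here in C ---- */

void ring_init(ring_t *r)
{
    assert(r != NULL);
    r->head = 0;
    r->count = 0;
}

int ring_push(ring_t *r, uint8_t byte)
{
    assert(r != NULL);
    if (r->count == RING_CAPACITY)
        return -1;
    /* Write position is head + count, wrapped around the buffer. */
    r->data[(r->head + r->count) % RING_CAPACITY] = byte;
    r->count++;
    return 0;
}

int ring_pop(ring_t *r, uint8_t *out)
{
    assert(r != NULL && out != NULL);
    if (r->count == 0)
        return -1;
    *out = r->data[r->head];
    r->head = (r->head + 1) % RING_CAPACITY;
    r->count--;
    return 0;
}
```

The point of keeping the interface to plain structs, integers, and error codes is that neither side needs to know which language the other side was written in.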
Zig by default will compile C libraries with things like undefined behavior sanitizing turned on. These are equivalent to the Clang options and other sanitizers are available (but I’m unfamiliar with which ones are turned on by default). This handles a lot of the runtime safety.
Zig code will have a lot of these behaviors as well, and in the future the compiler will add more safety checks as the compiler devs implement them. Things like returning a pointer to stack memory, etc. are in the issue tracker.
I haven’t had any issue stepping through my mixed codebase for Windchime. I use the dbus shared libraries and statically link the miniaudio library as well. I can step through both the Zig application code and the miniaudio fine. Dbus doesn’t have all the symbols since it is a system shared library, so I can’t step through that well, but if there is a crash in the dbus part, I can still get a backtrace and call stack info.
Granted, I’m still new to debuggers, so I’m more ignorant of some of the advanced usage stuff.
I feel like I have to advertise my “Zig OS” (sfiedler/zig_os: A combination of a kernel and a bootloader to show newcomers to OS development how to get the "start" done. - Codeberg.org) which implements a very basic x86_64 UEFI bootloader and kernel (leaving much room for people to implement their own stuff), where the goal is to explain every line as much as possible.
On some platforms (like riscv), debug info on freestanding doesn’t really work, which forces you to do llvm-dwarfdump --lookup ADDR KERNEL_ELF if you want to process the stack traces you get from a panic.
And, for example, valgrind “hooks” into your program using some OS stuff that is already implemented, like LD_PRELOAD, which isn’t available on freestanding. Things like sanitizers often are preconfigured to print their output somewhere, which isn’t always possible in embedded or OS programming (but Linux also doesn’t do it).
What is great though is that you can define your own test runner and so can make a small version of your OS that runs the tests directly on hardware.
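For readers coming from the C side, the general shape of a "bring your own test runner" can be sketched like this. To be clear, this is not Zig’s test runner API, just a C analogue of the idea, and all names (`test_case`, `run_tests`, `kernel_tests`, the two sample tests) are made up for illustration. The report callback is the part that matters for an OS: on real hardware it could write to a serial port instead of stdout.

```c
#include <assert.h>
#include <stddef.h>

/* A test returns 0 on success and non-zero on failure. */
typedef int (*test_fn)(void);

/* Where results go is up to the caller (serial port, framebuffer, ...). */
typedef void (*report_fn)(const char *name, int passed);

struct test_case {
    const char *name;
    test_fn     run;
};

/* Two hypothetical tests, one passing and one failing on purpose. */
static int test_addition(void)     { return (1 + 1 == 2) ? 0 : 1; }
static int test_always_fails(void) { return 1; }

static const struct test_case kernel_tests[] = {
    { "addition",     test_addition },
    { "always_fails", test_always_fails },
};

/* Runs every test in the table and returns the number of failures.
 * `report` may be NULL if the caller only wants the count. */
size_t run_tests(const struct test_case *tests, size_t n, report_fn report)
{
    assert(tests != NULL || n == 0);
    size_t failures = 0;
    for (size_t i = 0; i < n; i++) {
        int passed = (tests[i].run() == 0);
        if (!passed)
            failures++;
        if (report != NULL)
            report(tests[i].name, passed);
    }
    return failures;
}
```

A small test kernel could then call `run_tests(kernel_tests, 2, some_serial_reporter)` from its entry point and halt afterwards, where `some_serial_reporter` is whatever output routine the platform offers.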
Debugging works well with Zig, QEMU and GDB (see Kernel Debugging - OSDev Wiki), but I mostly use printf debugging (which is very common in the OS development space, I think). And by defining your own logFn in std_options, you can do a lot of printing via std.log.
That’s my experience with developing an OS. And I think Zig as a first OS language is better than C because it already protects you from some footguns (not all, but many).
In my limited experience of implementing “Operating System in 1000 lines of code”, I think that using Zig as a language, not only as the build system, has a good list of upsides.
The aforementioned book uses C code which I converted to Zig and every single instance of it turned out simpler and clearer than the original.
Here’s the repo with code and video recordings of the experience:
If I had to name just one feature, packed structs are significantly clearer to understand and less error prone than doing bit fiddling with a bunch of unclear int promotion rules in C.
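For anyone who hasn’t seen the C side of this comparison, here is roughly what the bit-fiddling version tends to look like, using a RISC-V Sv32 page table entry as the example (flag bits in the low 10 bits, physical page number in bits 10 to 31). This is my own minimal sketch, not code from the book or the repo, and the helper names (`pte_make`, `pte_ppn`, `pte_is_valid`) are hypothetical. Sticking to `uint32_t` and unsigned constants everywhere is how you stay out of trouble with integer promotion.

```c
#include <assert.h>
#include <stdint.h>

/* Sv32 page table entry flag bits. */
#define PTE_V (1u << 0)  /* valid */
#define PTE_R (1u << 1)  /* readable */
#define PTE_W (1u << 2)  /* writable */
#define PTE_X (1u << 3)  /* executable */
#define PTE_U (1u << 4)  /* accessible from user mode */

#define PTE_FLAGS_MASK 0x3FFu  /* low 10 bits hold flags */
#define PTE_PPN_SHIFT  10      /* PPN starts at bit 10 */

/* Builds an entry from a 22-bit physical page number and flag bits. */
static inline uint32_t pte_make(uint32_t ppn, uint32_t flags)
{
    assert(ppn < (1u << 22));
    return (ppn << PTE_PPN_SHIFT) | (flags & PTE_FLAGS_MASK);
}

/* Extracts the physical page number from an entry. */
static inline uint32_t pte_ppn(uint32_t pte)
{
    return pte >> PTE_PPN_SHIFT;
}

/* Returns non-zero if the valid bit is set. */
static inline int pte_is_valid(uint32_t pte)
{
    return (pte & PTE_V) != 0;
}
```

Every field needs its own mask, shift, and accessor, and nothing stops you from mixing them up, which is the kind of thing a packed struct with named fields makes much harder to get wrong.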