The Linux Kernel Is Just A Program

This demystified some aspects of the Linux Kernel init process for me, and helped me understand what an embedded Linux / other OS image might look like. Now I’m curious about what PID 1 is on my system…
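
For anyone else wondering the same thing, here is a minimal sketch in Python. It assumes a Linux system with /proc mounted, and the helper name pid1_name is made up for illustration.

```python
from pathlib import Path


def pid1_name(proc_root: str = "/proc") -> str:
    """Return the short command name of PID 1, as reported in <proc_root>/1/comm."""
    return Path(proc_root, "1", "comm").read_text().strip()
```

Calling print(pid1_name()) shows the name of whatever is running as PID 1, which is typically the init system on a regular install.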

11 Likes

This demystified some aspects of the Linux Kernel init process for me,

Writing your own Linux init process is one of those things I think (systems || embedded) programmers should try at least once.

Making it functional enough for any kind of real use is a ton of work, but the basics don’t take a lot of code and teach you a lot about how things work under the hood.
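
To give a sense of scale, here is a rough sketch of those basics in Python rather than a real init. It assumes a Python interpreter is available in the boot image, and the names reap_children and run_init are invented for this example. It starts one program, collects the exit status of every child (orphans get re-parented to PID 1, so an init has to reap them), and flushes disk buffers at the end.

```python
import os


def reap_children() -> list[tuple[int, int]]:
    """Wait for every child of this process and return (pid, exit_code) pairs."""
    reaped = []
    while True:
        try:
            pid, status = os.waitpid(-1, 0)  # block until any child exits
        except ChildProcessError:
            return reaped  # no children left to wait for
        reaped.append((pid, os.waitstatus_to_exitcode(status)))


def run_init(argv: list[str]) -> None:
    """Hypothetical init body: start one program, reap everything, flush disks."""
    pid = os.fork()
    if pid == 0:
        try:
            os.execvp(argv[0], argv)  # the child becomes the real workload
        finally:
            os._exit(127)  # only reached if exec failed
    reap_children()
    os.sync()  # write cached filesystem data out before stopping
```

A real init would also need to mount filesystems, handle signals, and never return, which is where the ton of work comes in.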

5 Likes

Very interesting article! As far as I know, PID 1 is reserved for systemd/runit. I played around with runit, and it won’t start unless its PID is 1.

1 Like

PIDs just increment, so PID 1 is just the first process. If a program doesn’t start unless it’s PID 1, that’s the program doing that, not Linux. PID 1 is only special in that the kernel looks for and runs an init program when it starts, so that program is always PID 1, as no programs are run before it.
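
A check like that takes only a couple of lines. This is a hypothetical Python sketch of the idea, not runit’s actual code; is_init and main are names made up for the example.

```python
import os
import sys


def is_init(pid: int) -> bool:
    """True if the given PID is 1, i.e. the process the kernel started as init."""
    return pid == 1


def main() -> int:
    # The refusal lives in the program itself; the kernel would happily run it
    # under any PID.
    if not is_init(os.getpid()):
        print("refusing to start: not running as PID 1", file=sys.stderr)
        return 1
    print("running as PID 1")
    return 0
```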

I’m not sure why runit would do that; it supports being run by users as a user-specific service manager. That being said, I never got that to work, so maybe it does only accept being PID 1.

Since you mentioned runit: if you’re interested in alternatives to systemd, I have settled on dinit. It’s quite simple and just works™.

3 Likes

To be precise, it is the first and the last userspace program to be running on top of the kernel; usually the init system, be it systemd, runit, sysvinit, etc., because that is what starts and orchestrates all the other processes. But nothing really keeps you from running a different program as PID 1. For example, you can just run Bash as your PID 1 (by specifying the init=/bin/bash kernel command line parameter in your bootloader), which gets you a root shell. I have used that successfully to fix my systemd config at various points when I managed to bork it. A lot of stuff we’re used to having does not work when there’s no init system, of course, but you do get a functional shell, and text editors usually don’t require anything more, so that is quite useful.

One caveat is that when PID 1 exits, the kernel panics, because it has nothing more it could logically do, so when you’re fixing things, don’t forget to sync before exiting.

EDIT: Rewords, typos

3 Likes

Haven’t heard of it. Does any distro use it? I am looking to hop.

Dinit
I use it with Artix Linux (Arch without systemd, supports various init systems).
It is also supported by eweOS and Chimera Linux.
And you can ofc shove it into any distro if you feel like it.

Chimera is one I’ve kept my eye on as it avoids all the popular and old options. I’ll probably switch to that eventually, unless I can find a decent alternative to nix/guix.

1 Like

Super interesting article! Even after using Linux for 7+ years, I learned a good amount from this.

I still think it’s sad that Zig wasn’t in the right place at the right time to become a second language for the kernel. I understand why Rust won, but Zig would’ve been better and more aligned with the ideals of Linus and the main kernel contributors I think.

6 Likes

I think the blog post kinda mixes kernel and init process, which I don’t think is a good thing.

About the kernel part: While technically a kernel is just a program, it’s a program that works quite differently from the traditional programs you normally interact with.

To create a hopefully understandable analogy: Imagine you had to do everything purely through POSIX signals (with the difference that there are a LOT more of them).
(Yes, I know that there’s also the difference between a trap and an interrupt, but this is an analogy.)

So while in a traditional program you work quite sequentially, in a kernel you work event based.

About the init part of the blog post: Doing something like this is also known under the umbrella term Single Application Linux, normally for embedded devices. They sometimes add things like busybox to ease certain things (like mounting a filesystem for logging), but that’s it.

5 Likes

Even if it had been in the right place at the right time, I doubt the kernel programmers would have used Zig, and I wouldn’t have blamed them. Zig isn’t obviously enough better than the current Linux kernel processes to convince people to throw out decades of code and tribal knowledge.

Now, I’m not convinced Rust is either (and some of the fiascos Rust is having replacing coreutils seem to back that up). However, the argument is at least a lot stronger.

1 Like

AFAIK, all the bugs with uutils are logic bugs, not security bugs. Either way, though, the GNU coreutils have been battle-tested for 30 years, and I’m unsure how many of their CVEs were due to memory corruption. I know that liblouis, a C library for braille back and forward translation, is trying to rewrite their library in Rust because every single CVE they’ve ever had was memory related. I wonder if coreutils is similar?

I think the blog post kinda mixes kernel and init process

I am the author of the post, curious about which part indicates this? Maybe I can polish the wording if it came across that way.

So while in a traditional program you work quite sequentially, in a kernel you work event based.

Regular programs can work event based as well: think about async IO, poll, epoll, the whole Go runtime, tokio in Rust, etc. Likewise, the kernel can run sequentially, especially during the boot process.

3 Likes

It’s more the way the whole text flows from talking about the kernel to talking about the PID 1 process to the init process.

It’s not “this sentence causes that” but “this text structure causes that”.

poll, epoll etc. are not event based in the way signals (or, in the case of kernels, traps and interrupts) are. See the paragraph I wrote before that one to see what I mean. With these structures you provide synchronisation points where you or the event wait if necessary. You can choose where in your code you react to the outside world, but your code still behaves (and is being read by humans) sequentially.

But this has one exception: POSIX aio if you use lio_listio only with LIO_NOWAIT and you don’t use aio_suspend.

1 Like

It’s more the way the whole text flows from talking about the kernel to talking about the PID 1 process to the init process.

The post follows the actual boot sequence. Kernel starts then init starts. That’s how it happens in real life.

Regarding event-based: it doesn’t matter whether event handlers are registered in an interrupt vector table or an epoll loop. The abstraction layer is different, but the model is the same: code waiting to react to external events. Programs and kernels both do this. The distinction you’re drawing is an implementation detail, not a fundamental difference.

your code still behaves (and is being read by humans) sequentially

Since one processor core can execute one instruction at a time, kernel code is also “read and executed sequentially” by that definition. There’s no special magic.

FWIW, I also got this impression as I was reading through the post for the first time. I tried looking for the particular wording that caused this, and could not find it.

That is interesting (I really mean it - I am curious what could cause this impression), because multiple explicit distinctions were made:

“once our kernel initializes itself, … and hand over control to a program called init”
“Our kernel booted normally, then it started our Go program, the init process.”
“Up until the Run /init as init process line we are in the kernel space. With the init process starting we are entering into the user space.”

I fully read the article for the second time, and did not get the impression at all. So I think my impression the first time around was, most likely, due to my own (incorrect?) expectations: I knew the article was going to talk about the init process, and I interpreted the first explanation about what a kernel is, to be incorrectly explaining what an init process is. I guess?

In short, it was probably me. Sorry for the noise!

2 Likes

Thank you for double-checking it, this is really helpful.

Ok, to put it into different words:

With an epoll loop (or similar) you decide when you stop processing one event and start processing another. You call a function, figure out what happened based on what the function gives you, and react.

With a signal/interrupt/trap somebody else decides for you when you react to a new event. It can happen at any time.

As an analogy: With epoll you go to your boss to ask for more tasks. With signals your boss comes to you, tells you to drop what you are doing and gives you a new task.
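
In code, the difference looks roughly like this. It is a small Python sketch rather than kernel code, with made-up names (on_sigusr1, poll_once, demo). selectors.DefaultSelector picks epoll on Linux, and Python only runs signal handlers between bytecode instructions in the main thread, so this is a softer version of what a real interrupt does.

```python
import os
import selectors
import signal

log: list[str] = []


def on_sigusr1(signum, frame):
    # Signal style: this runs when the signal arrives, not when we ask for it.
    log.append("signal handler ran")


def poll_once(sel: selectors.BaseSelector) -> list[bytes]:
    # Polling style: pending events are only seen when we call select().
    return [os.read(key.fd, 1024) for key, _ in sel.select(timeout=0)]


def demo() -> list[str]:
    log.clear()
    previous = signal.signal(signal.SIGUSR1, on_sigusr1)
    read_fd, write_fd = os.pipe()
    sel = selectors.DefaultSelector()  # uses epoll on Linux
    sel.register(read_fd, selectors.EVENT_READ)
    try:
        os.write(write_fd, b"task")           # event is pending, nothing reacts yet
        log.append("busy with other work")
        os.kill(os.getpid(), signal.SIGUSR1)  # handler runs without us polling
        log.append("asking for events")
        log.extend(f"polled {data!r}" for data in poll_once(sel))
    finally:
        signal.signal(signal.SIGUSR1, previous or signal.SIG_DFL)
        sel.close()
        os.close(read_fd)
        os.close(write_fd)
    return list(log)
```

The pipe event sits unnoticed until poll_once is called, whereas the handler gets to run without the main flow ever asking for it.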

1 Like

I think the confusion comes from:

The title only talks about the kernel, which sets expectations,
and the first half is only about the kernel, while the second half is only about init.

I think you just need to clarify earlier in the post that there are two topics, perhaps in the title as well. init is also just a program.

1 Like