initramfs and where user space truly begins

[Posted July 11, 2006 by corbet]

The initramfs mechanism was added to the 2.5.46 kernel. With initramfs, a boot-time filesystem can be created (in cpio format) and appended to the kernel image file. When the system boots, it will have access to the filesystem from the very beginning of the bootstrap process - long before it reaches the point of being able to mount disks. Initramfs works much like the venerable initrd facility, but, unlike initrd, initramfs does not require the system to be able to mount a disk and find the filesystem image.
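To make "a filesystem in cpio format" concrete, here is a minimal sketch of how such an archive is laid out. It assumes the "newc" variant of cpio (ASCII header with the magic number 070701); the helper names `newc_entry` and `build_initramfs` are invented for illustration, and real images are normally produced by the kernel build or the cpio utility rather than by code like this.

```python
def newc_entry(name, data=b"", mode=0o100644, ino=0):
    """Return one cpio 'newc' record: 110-byte ASCII header, name, data."""
    name_bytes = name.encode("ascii") + b"\0"  # name is stored NUL-terminated
    # Thirteen 8-digit hex fields follow the 6-byte magic:
    # ino, mode, uid, gid, nlink, mtime, filesize,
    # devmajor, devminor, rdevmajor, rdevminor, namesize, check
    fields = [ino, mode, 0, 0, 1, 0, len(data), 0, 0, 0, 0, len(name_bytes), 0]
    entry = b"070701" + b"".join(b"%08X" % f for f in fields) + name_bytes
    entry += b"\0" * (-len(entry) % 4)  # pad header+name to a 4-byte boundary
    entry += data
    entry += b"\0" * (-len(entry) % 4)  # pad file data to a 4-byte boundary
    return entry


def build_initramfs(files):
    """Build an archive from a {name: bytes} mapping (regular files only)."""
    archive = b""
    for ino, (name, data) in enumerate(sorted(files.items()), start=1):
        archive += newc_entry(name, data, ino=ino)
    # The archive ends with a special record named "TRAILER!!!".
    archive += newc_entry("TRAILER!!!", mode=0)
    return archive


if __name__ == "__main__":
    image = build_initramfs({"init": b"#!/bin/sh\n"})
    print(len(image), "bytes")
```

The sketch handles only regular files owned by root; directories, device nodes, and compression of the result are left out.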

Initramfs is increasingly useful as hardware becomes more complex. Often, simply finding the root filesystem can involve complex hardware setup, conversations across the network, getting cryptographic keys, piecing together RAID or LVM volumes, and more. Currently, much of this work is done inside the kernel itself, leading to kernel code which duplicates user-space tools - but with less review and maintenance. Moving this work into a user-space boot-time filesystem promises to shrink the kernel, make the boot process more reliable, and allow distributors (and users) to customize the early bootstrap process in interesting ways.

Thus far, however, use of initramfs has been limited; in particular, all of the early boot code remains in the kernel. One of the blocking points has been the need for a minimal C library which would work in that environment. This library (klibc) has been under development, slowly, for years. That work has recently culminated in a set of klibc patches posted by H. Peter Anvin. Klibc is now in a position to help rework the Linux bootstrap process - and to force discussion of just how the kernel should interact with tightly-coupled utilities.

The core klibc patch includes replacements for a long list of C library functions and system call wrappers. It is sufficient, for example, to support a minimal shell called "dash" and a port of the gzip utility. There is a root filesystem mounting utility which can handle several filesystem types, along with code for obtaining an IP address using bootp or DHCP, mounting NFS filesystems, assembling RAID volumes, resuming suspended systems, and more. Much of the code which performs those functions can then be removed from the kernel itself. Klibc and the kinit program which comes with it appear to be getting close to ready for real use.
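The kind of decision-making that moves to user space can be sketched briefly. The example below is an illustration only, not kinit's actual logic: it assumes an early-boot program reads the familiar boot parameters (root=, rootfstype=, nfsroot=, ip=, resume=) and turns them into an ordered list of steps. The function names and the step tuples are invented for this sketch.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into a {key: value} dict.

    Bare flags such as "ro" map to True. Quoting is not handled.
    """
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else True
    return params


def plan_root_mount(params):
    """Return an ordered list of (action, argument...) tuples."""
    steps = []
    if "resume" in params:
        # Resuming a suspended system has to be tried before mounting root.
        steps.append(("resume", params["resume"]))
    if params.get("root") == "/dev/nfs":
        # NFS root needs a configured network interface first.
        steps.append(("configure-network", params.get("ip", "dhcp")))
        steps.append(("mount-nfs", params.get("nfsroot")))
    else:
        steps.append(("mount", params.get("root"),
                      params.get("rootfstype", "auto")))
    return steps


if __name__ == "__main__":
    line = "root=/dev/nfs nfsroot=192.168.0.1:/export/root ip=dhcp ro"
    for step in plan_root_mount(parse_cmdline(line)):
        print(step)
```

Once this logic lives in a user-space program, a distributor can change the policy (add a RAID assembly step, say) without touching kernel code.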

This code, like other efforts to move core kernel features into user space, raises a number of questions. Some of these are likely to come up at the kernel summit in Ottawa, but a real solution is likely to be rather longer in coming.

The fundamental question is this: are klibc and kinit part of the kernel? They consist of code which used to be part of the kernel itself, and which is a necessary part of the kernel bootstrap process - if the related code is removed from the kernel, the kernel will not be able to run without kinit. Both components are tightly tied to the kernel, to the point that a kernel upgrade may often require upgrading kinit and klibc as well. A system where the kernel and kinit go out of sync may well fail to boot.

To many developers, these reasons are more than adequate to justify packaging (and building) kinit and klibc with the kernel itself. If the code is kept and built together, it has a much higher chance of continuing to function as a coherent whole. Every kernel/kinit combination will have been tested together and will be known to work. If, instead, the two are separated, the resulting kinit will be, in essence, a large body of kernel code which is not reviewed and maintained with the rest of the system. The quality of kinit could be expected to suffer, complaints from users could grow, and differences between distributions could increase.

On the other hand, if kinit must be part of the kernel, one could well ask just where the line should be drawn. Should udev, which has suffered from (rare) kernel version incompatibilities, be included? How about the user-space software suspend code? Cluster membership utilities? Filesystem checkers? Wireless network authentication daemons? Unless Linux is going to head toward a more BSD-like organization (an unlikely prospect), we will not see all of the above tools included in the kernel tarball anytime soon. And so, according to some, kinit and klibc should be maintained as out-of-kernel packages like any other user-space code.

There is another important issue here, however: compatibility between distributions and between kernel versions. Earlier this year, your editor had a system running a development distribution fail to boot; that distribution's maintainers had concluded that, since the distribution-specific initrd image mounted /proc and /sys, there was no reason for the initialization scripts to do so as well. Your editor, who has never had much use for initrd, was left with a system which was unable to run a vanilla kernel.org kernel. That particular change was (after your editor complained) backed out, but the issue remains: distribution-specific initialization code can make it impossible to run kernels obtained from elsewhere. Ted Ts'o has also pointed out an initialization problem which makes RHEL4 unable to run current kernels on some systems. He says:

Kinit SHOULD be merged into the kernel, and the responsibility of creating the initrd/initramfs image should be moved from the distribution into the kernel build process. There can and should be a way for distro's to add their own "value add specials" into the initrd/initramfs image, but we have to take over creating the base initial userspace environment.

This is a discussion which could go on for some time; it could become one of the more contentious issues at the kernel summit. There is a subset of the kernel development community which has a strong desire to move as much code as possible into user space. Not everybody agrees that this is the right approach, but, to the extent that code is shoved out of kernel space, there must be a vision describing how all of the pieces will continue to work well together into the future. That vision does not yet appear to exist.

How much API should the kernel export?

Posted Jul 13, 2006 3:18 UTC (Thu) by felixfix (subscriber, #242)

That's what it comes down to. It has always exported a system call interface. Way back when, kernels didn't have these newfangled /proc and /sys interfaces, but they always had /dev.

An extreme case of exportitis is a micro kernel which exports everything and has almost no internal code.

Somewhere in between is the holy grail (to some) and mark of the beast (to others) of an interface stable enough to allow binary modules whose lifetimes span an entire kernel major series.

I wouldn't mind these kinds of stable interfaces IFF they came with the understanding that anything using them would run several times slower and have fewer features than a native driver. But my experience has been that those who would agree with that today would complain tomorrow of being second class kernel outsiders. They would point fingers at every subsequent minor release which increased the incompatibility and made the universal drivers ever more distant and klugey, and be a real drag on development, trying to hold back good changes for their own selfish interests.

I wonder how long before this particular development, and others to follow, take that same path.

How much API should the kernel export?

Posted Jul 13, 2006 4:40 UTC (Thu) by dlang (guest, #313)

nobody is suggesting that the interface would be stable with kinit and klibc.

in fact that's one of the reasons they say they should be included with the kernel, specifically so that they adapt immediately to API changes. this is not an ABI proposal by any means.

initramfs and where user space truly begins

Posted Jul 13, 2006 4:55 UTC (Thu) by dlang (guest, #313)

I started playing with linux around the 0.99 days and have been making my living with it since the 2.0 days; when I started, you _had_ to compile your own kernels. I still do for ease of maintenance and performance reasons (and yes, I also drive a stick-shift, I like having control :-)

during the last 10 years of making my living with Linux the only time I have used initrd or initramfs is when booting a new distro (usually only long enough to download the kernel source and recompile). I don't like the extra step of updating the boot filesystem and matching the right one to the right kernel. As such I am among those who have been nervous at the claims that everything that can move out of the kernel must do so.

however with initrd and klibc/kinit it can be possible to have the straightforward make menuconfig && make && make install process produce a single object that is enough to boot the system (satisfying my desires) while still splitting functions out of the kernel itself into userspace. as long as this is available it really doesn't matter much to me what moves where.

I would say that the line of what should be part of the kernel tree and what shouldn't needs to be based on what is needed to function and drive the hardware. As such udev and alsalib should probably be included. software suspend code may belong there as well (there are only a small number of ways to do the job, and they tie in fairly tightly with the kernel itself, besides the debate over if it should be kernelspace or userspace to begin with), but cluster membership works quite well separately (and you have quite a few different options to choose from) with the other things being even further out.

alsa is a particularly good item to look at, it's half in the kernel and half in a userspace library, but the API that everyone is supposed to use is the library, not the kernel. As such it could be argued that the kernel API's really aren't relevant and the library should be packaged with the kernel (it's not today because of the kernelspace and userspace dividing line, but maybe it should be)

David Lang

initramfs and where user space truly begins

Posted Jul 13, 2006 10:53 UTC (Thu) by nix (subscriber, #2304)

With initramfs you can do all of that too: in fact the initramfs build process is much *easier* for the builder than the initrd ever was, because the build system can put together the cpio archive for you and compress it.

Plus, there's *no* danger of finding that you've managed to lose the initrd that corresponds to some kernel, and now you can't boot it anymore, or finding that your initrd has changed but your kernel hasn't (perhaps you had one initrd in use by several kernel images) and now you can't boot it either.

And anything that zaps pivot_root(2) and the other mass of wildly variable and variously bizarre historic horrors that initrd has accumulated to switch to the real root *has* to be good. A tiny C program to close all fds, rm -rf /-on-one-filesystem, chroot(), and execve() is all you need to use to switch from initramfs. :)
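The four steps listed above (close descriptors, delete the initramfs contents without crossing filesystems, chroot(), execve()) can be sketched as follows. This is an illustration under stated assumptions, not any existing tool's code: the function names are invented, it must run as PID 1 with root privileges, and it omits the step real implementations usually add of moving the new root mount onto / before the chroot. The comment above speaks of a C program; Python is used here only to keep the sketch short.

```python
import os
import stat


def remove_contents_one_fs(top, root_dev=None):
    """Delete everything below `top` that is on the same filesystem as `top`.

    Entries on other filesystems (mount points such as the new root)
    are skipped, which is the "rm -rf on one filesystem" idea.
    """
    if root_dev is None:
        root_dev = os.lstat(top).st_dev
    for name in os.listdir(top):
        path = os.path.join(top, name)
        st = os.lstat(path)  # lstat: never follow symlinks
        if st.st_dev != root_dev:
            continue  # different filesystem: leave it alone
        if stat.S_ISDIR(st.st_mode):
            remove_contents_one_fs(path, root_dev)
            try:
                os.rmdir(path)
            except OSError:
                pass  # still holds a mount point from another filesystem
        else:
            os.unlink(path)


def switch_root(new_root, init="/sbin/init"):
    """Hypothetical helper: hand control to the real root. Never returns."""
    os.closerange(3, 1024)        # close inherited fds; stdio is kept here
    os.chdir(new_root)
    remove_contents_one_fs("/")   # frees the memory used by the initramfs
    os.chroot(".")
    os.chdir("/")
    os.execv(init, [init])        # replace this process with the real init
```

Only `remove_contents_one_fs` is safe to try outside an initramfs, and then only on a scratch directory; `switch_root` as written would delete the contents of the root filesystem.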

initramfs and where user space truly begins

Posted Jul 13, 2006 15:44 UTC (Thu) by dlang (guest, #313)

however there's still the need (currently) to prepare the initramfs manually before building the kernel.

while it's optional this isn't a problem (I just ignore the option entirely), but if/when it's made mandatory this separate manual step should be automated.

initramfs and where user space truly begins

Posted Jul 13, 2006 16:25 UTC (Thu) by nix (subscriber, #2304)

There's no need to do that, unless by 'prepare' you mean 'tell the kernel build infrastructure which files should go into initramfs'. I can see no way to automate *that* without eliminating all the configurability (of course there should be a default that uses kinit if kinit becomes mandatory, and the kinit patches do indeed provide such a default).

initramfs and where user space truly begins

Posted Jul 13, 2006 16:40 UTC (Thu) by dlang (guest, #313)

any portions that the kernel requires (kinit for example) need to be pulled in automagically.

this could be as simple as having a directory under the source tree /initramfs such that anything that's in there gets used to create the initramfs (and the kernel compiles kinit and any other required pieces and puts them in there)

or any other method of making a default initramfs that provides hooks so that the distros can add their own stuff in.

the point I'm looking for is that today you can make a monolithic kernel by make *config && make and then use the resulting file on any compatible machine and it's sufficient to boot the machine. if initramfs is made mandatory then it needs to be equally simple to manage.

initramfs and where user space truly begins

Posted Jul 14, 2006 11:55 UTC (Fri) by nix (subscriber, #2304)

It already is: there is a default initramfs source file which contains everything needed for kinit; you can add stuff to it as you wish.
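For readers who have not seen an initramfs source file, here is a rough sketch of reading one. It assumes the line-oriented list format understood by the kernel's gen_init_cpio tool ("dir", "nod", "file" and "slink" lines); whether the kinit default uses exactly these entries is an assumption, and the path /tmp/kinit in the example is made up.

```python
def parse_initramfs_list(text):
    """Parse gen_init_cpio-style lines into a list of dicts.

    Handles only dir, nod, file and slink entries; blank lines and
    lines starting with '#' are ignored.
    """
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        kind, *args = line.split()
        if kind == "dir":
            name, mode, uid, gid = args
            entry = {"name": name}
        elif kind == "file":
            # Optional hard-link names after gid are ignored in this sketch.
            name, location, mode, uid, gid = args[:5]
            entry = {"name": name, "location": location}
        elif kind == "slink":
            name, target, mode, uid, gid = args
            entry = {"name": name, "target": target}
        elif kind == "nod":
            name, mode, uid, gid, dev_type, major, minor = args
            entry = {"name": name, "dev_type": dev_type,
                     "major": int(major), "minor": int(minor)}
        else:
            raise ValueError("unsupported entry type: " + kind)
        # Fields shared by every entry type; mode is written in octal.
        entry.update(type=kind, mode=int(mode, 8), uid=int(uid), gid=int(gid))
        entries.append(entry)
    return entries


EXAMPLE = """\
# a tiny early user space (illustrative only)
dir /dev 0755 0 0
nod /dev/console 0600 0 0 c 5 1
file /init /tmp/kinit 0755 0 0
"""

if __name__ == "__main__":
    for e in parse_initramfs_list(EXAMPLE):
        print(e["type"], e["name"], oct(e["mode"]))
```

Adding "stuff" then amounts to appending more lines of this kind to the default file.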

initramfs and where user space truly begins

Posted Jul 13, 2006 5:07 UTC (Thu) by drag (guest, #31333)

As far as I am concerned initramfs kicks-ass.

It's finally an initrd thing I can understand. I can add my own scripts and do my own thing and it's not a huge deal.

For instance I wanted to have my flash drive show up as /dev/flashdrive and its partitions as /dev/flashdrive1 and /dev/flashdrive2.

This is because on different machines it would show up differently. Some machines had it as /dev/sda1 and /dev/sda2 and others with SATA drives would have it show up as /dev/sdb or c or d. This was a very important problem because I installed Debian on the flash drive for booting machines up so I could carry around a linux system with me for surfing the web or doing rescue stuff or whatever.

I tried initially to reference the thing using its volume label, but that was not a total solution. It didn't always work, as the /dev/disk stuff was just a symbolic link to /dev/sd*. Also I didn't want it to change the order in which the drives were detected. If the hard drives showed up as /dev/sda I wanted to make sure that they stayed that way.

editing the initramfs scripts in my own inept way made it simple for me to work around the various small issues that cropped up when trying to have a single root file system and initrd image boot up every computer that I happened to come across.

So next thing I am going to do is use squashfs and UnionFS and some custom scripts to make it so that I can compress the majority of the root file system to reclaim disk space and increase speed and yet keep it read-write.

initramfs and where user space truly begins

Posted Jul 13, 2006 10:49 UTC (Thu) by nix (subscriber, #2304)

Yes. The kernel devs are in a bit of a bind.

If they leave initramfs as it is now, completely replaceable by the builder, then the builder's existing initramfs setup will continue to work: but nothing new can be moved out of the kernel into early userspace without requiring the builder to update that setup.

If they switch over initramfs so that the user can add things to an existing klibc-based system, they allow migration of extra init work from the kernel, and shoot a lot of existing users in the foot (e.g. those of us with busybox+uClibc-based initramfses are in trouble, because busybox won't build with klibc; there are a lot of other programs that won't either; will e2fsprogs's fsck work when linked against klibc? What about mdadm?)

(And use of initramfs is common not just by distro kernels but also by those of us who keep our root filesystems in LVM on MD, so as to get a combination of LVM expandability and RAID robustness, let alone anyone who uses an encrypted root filesystem on a network block device or anything elaborate like that, as you said. I know I had my root filesystem on a network block device for a few weeks solely to let me keep running while I recovered from a major disk failure: that's what pushed me to RAID in the first place).

initramfs and where user space truly begins

Posted Jul 13, 2006 15:49 UTC (Thu) by dlang (guest, #313)

nothing says that your initramfs can't have some programs with klibc and some with glibc (although that does waste some space)

they are working on getting the programs that you mention to run with klibc so it's a temporary problem (and one that will be easier to fix once klibc is included with the kernel, which will ease the maintenance burden that's involved with tracking kernel changes, allowing for more time to be spent on any changes to klibc that need to be done)

initramfs and where user space truly begins

Posted Jul 13, 2006 16:26 UTC (Thu) by nix (subscriber, #2304)

IIRC Rob Landley was using words like 'no chance' regarding getting large parts of busybox to work (please correct me if I'm talking nonsense, Rob, my memory is hazy right now due to insomnia).

initramfs and where user space truly begins

Posted Jul 14, 2006 5:09 UTC (Fri) by dlang (guest, #313)

not all the work needs to be done by Rob or the busybox developers

remember that maintaining a large patch out-of-kernel is a significant drain on a project's resources; once it can move into the kernel that drain is stopped and the time can be spent on other things, including (in this case) plugging the holes that prevent it from working with more apps (to a large degree anyway, they don't want to have to support every function call forever)

initramfs and where user space truly begins

Posted Jul 13, 2006 15:51 UTC (Thu) by cventers (guest, #31465)

Well, unless klibc implements things uClibc doesn't, why couldn't a user
using busybox and uClibc with initramfs continue to use uClibc and
build the kinit stuff against it?

I agree that it's a tough call but I think it's an exciting, neat and
clean idea to move more of that boot policy out of the kernel. There's
nothing more irritating to me than watching the kernel panic because the
VFS can't mount root, and then having to juggle boot CDs to go in and fix
it. Having dash available right then to step in would be convenient :)

initramfs and where user space truly begins

Posted Jul 13, 2006 16:27 UTC (Thu) by nix (subscriber, #2304)

Indeed this seems ideal :) it could be done easily by simply allowing the kinit toolchain to differ from that used for everything else (so you could put your uClibc toolchain in there instead).

(I can't recall if this is already done.)