Nix at SCALE
The first-ever NixCon in North America was co-located with SCALE this year. The event drew a mix of experienced Nix users and people new to the project. I attended talks that covered using Nix to build Docker images, upcoming changes to how NixOS performs early booting, and ideas for making the set of services provided in nixpkgs more useful for self-hosting. (LWN covered the relationship between Nix, NixOS, and nixpkgs in a recent article.) Near the end of the conference, a collection of Nix contributors gave a "State of the Union" talk about the growth of the project that also highlighted areas of concern.
Docker
Xe Iaso, a "Senior Technophilosopher" at Fly.io, gave a talk called "Nix is a better Docker image builder than Docker's image builder". Iaso opened the talk by explaining that "Nix is different from what many developers expect", and requires "a lot of work upfront". Because of that, xe (Iaso uses xe/xer pronouns) is worried that "Nix will wither and die in the industry". Xe said that Docker, on the other hand, has had unparalleled success. It has become the "de-facto package format for the internet", with many large hosting providers (such as Iaso's own employer) offering the ability to run Docker containers as a service. Xe proposed that showing how Nix could fit into existing workflows that use Docker, while delivering better reproducibility and performance, could help drive Nix adoption.
Iaso said that despite its widespread adoption, Docker has a fatal flaw: "Docker builds are not deterministic, like not even slightly". Docker image builds frequently have access to the internet, which is needed to download packages for inclusion in the image. Unfortunately, servers on the internet change all the time, so recreating the exact set of inputs to a build, and therefore rebuilding the same image later, is difficult.
This also has the effect of ensuring that some common software artifacts — such as the GNU C library (glibc) — are duplicated many times across different containers. While it is possible to split glibc into its own Docker "layer" that can be reused between Docker images when it has not changed, most Docker images don't do this. Instead, it is common to first update local package lists through the equivalent of apt update, and then install software directly into another layer. This means that every time the Docker image is rebuilt, the first layer where the package lists are fetched changes, which invalidates the second layer and requires re-downloading and storing another copy of the software that the image uses.
Iaso asked the audience to imagine: "What if your builds didn't need an internet connection, because everything was already downloaded and in the path for you before your build started?" This can be done using Nix, because "Nix lets you know exactly what you depend on ahead of time". Xe then went on to show a demo of a simple Go application, and the steps required to package it in a Docker image using Nix.
nixpkgs has a library called dockerTools, a set of tools to put packages into Docker images in "opinionated ways". Building an image with dockerTools requires little configuration if the project being built is already specified with Nix. The Docker images built in this way do not depend on Nix once built, and can be used like any other image. The library can build Docker images as a single layer, but the preferred method is to put each package in its own layer, which allows sharing layers not only between subsequent builds of the same software, but also between images for different services that have some overlap in their dependencies. Glibc, for example, is a dependency of nearly everything. Putting it in a separate layer allows every Docker image built in this way that depends on that specific version of glibc to share the relevant layer. Sharing layers doesn't just reduce storage costs; it also reduces build times, because Nix can cache image layers the same way it caches other build outputs. "It's magic; it's saved me so much time", Iaso said.
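As a rough illustration of what such a build can look like, the sketch below uses dockerTools.buildLayeredImage to package pkgs.hello, which stands in here for the Go application from the demo. The image name, the tag, and the choice of package are assumptions made for this example; this is not the code that was shown in the talk.

```nix
# default.nix: a minimal sketch, not the code shown in the talk.
{ pkgs ? import <nixpkgs> { } }:

pkgs.dockerTools.buildLayeredImage {
  # Hypothetical image name and tag, chosen for illustration.
  name = "hello-example";
  tag = "latest";

  # Packages to include in the image. The store paths they depend on
  # (glibc among them) are pulled in automatically and distributed
  # across layers.
  contents = [ pkgs.hello ];

  # Command to run when the container starts.
  config.Cmd = [ "${pkgs.hello}/bin/hello" ];
}
```

Running nix-build on this file should leave a result symlink pointing at an image tarball, which can then be loaded with docker load < result and run like any other image.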
During Q&A, a member of the audience asked if there is "a practical limit to the number of layers". Iaso responded: "Yes, and that limit is 128, and it's dictated by the filesystem drivers that Docker uses." Xe went on to specify that if there are more Nix packages than available Docker layers, the dockerTools library will make the least popular (and therefore least likely to be shared) packages share a layer. Another person asked "What about layer ordering?", referring to the way that a Docker container contains layers in a specific order. Iaso explained that layer ordering is "an illusion" that doesn't actually impact the functionality of the image (unless one layer overwrites a file provided by another layer), and that Nix picks an arbitrary order.
Systemd
Will Fancher gave a talk about the multi-year effort to upgrade NixOS's init system. When a Linux system boots, the bootloader loads a kernel and an initial RAM disk. The kernel unpacks the RAM disk and launches an init process as PID 1 from it. It is PID 1's job to actually set up the system to run by attaching the correct disks and filesystems, loading required kernel modules, and doing anything else necessary to reach a fully booted system.
In NixOS, this step is currently done by a shell script that Fancher referred to as "scripted stage one", which encapsulates "the sequence of steps that needs to be taken by just about any installation of NixOS". The problem with this script is that it's serial and imperative. "It's all written in this ... shell-scripty kind of way. [...] It's all a little co-dependent. It's awkward to maintain, and it's frustrating to write." It's also "a lot of custom code" that NixOS can't really directly share with other projects.
Since 2022, Fancher has worked with other members of the Nix community on an alternative. Systemd is already used as PID 1 on a fully booted NixOS system; this new work also makes it part of the initial RAM disk used to set up the system. Using systemd from the beginning has several advantages: systemd is declarative and parallel, and it provides "tools that come straight out of the systemd project, that we don't have to develop and maintain ourselves". These tools include niceties such as rescue and debug shells, systemd-networkd for configuring networking in a more robust way, systemd-ask-password to unify the interface for letting services request passwords on boot, and systemd-cryptsetup for supporting hardware keys for disk encryption.
Fancher gave an example from his own work where a server had a complicated tangle of different encrypted disks and networking services that were required to boot properly. He said that systemd lets you specify these kinds of complicated setups in an elegant way. He also pointed out that systemd's parallelism might speed up the early boot by preparing multiple disks simultaneously, although since this phase of the boot process is "only a few seconds for most systems", the impact won't be large.
The work to bring systemd to NixOS's early boot is quite far along. The option to enable the new mode was stabilized toward the end of last year in version 23.11, and Fancher hopes for it to become the default in version 24.05, which is expected in May. He said that there were still minor incompatibilities, but that those are detected at build time and a warning is issued while falling back to the existing code. He called out bcachefs, in particular, as something that does not yet work with the new setup, saying: "The reason it works in scripted stage one at all right now is basically luck." NixOS users wishing to try the new stage one implementation early can set boot.initrd.systemd.enable in their configurations.
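For readers who want to try it, enabling the option named above is a one-line change. The fragment below is a minimal sketch of how it might sit in a NixOS configuration; the surrounding file layout is an assumption, and only the option name comes from the talk.

```nix
# Fragment of a NixOS configuration (for example, configuration.nix).
{
  # Opt in to the systemd-based stage one instead of the scripted one.
  boot.initrd.systemd.enable = true;
}
```

The change takes effect once the system has been rebuilt and rebooted, since it only affects the initial RAM disk.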
Once the new mode becomes the default, version 24.11 (expected in November) will remove the networking code in the existing scripted stage one, which Fancher called out as particularly "janky". Assuming this goes as planned, version 25.05 will remove scripted stage one entirely. At the end of his talk, Fancher said that the work had involved a lot of collaboration, and that a takeaway should be: "If there's a little thing that you want to contribute, then you should go ahead and contribute."
Module contracts
Pierre Penninckx gave a more theoretical talk calling for the Nix community to standardize a set of conventions for configuring services. Standing up a service that has been packaged for NixOS is easy; Penninckx showed how to run Nextcloud on a NixOS server:
services.nextcloud.enable = true;
"Since the moment I saw that, I thought 'I need to use Nix for everything now'."
This ease comes at a cost, however. All of the other services that a particular service depends on are hard-coded by the maintainer of the NixOS module. For example, Nextcloud depends on NGINX, PostgreSQL, and a handful of other services. A user wishing to substitute Apache as the reverse proxy in front of Nextcloud would be out of luck unless they forked the NixOS module and wrote their own definition. This also means that there's no good way to reuse the configuration of smaller components between different modules.
To remedy this, Penninckx suggested having explicit contracts between modules. NixOS modules communicate with each other via configuration options in a central attribute set (Nix's equivalent of a dictionary). Explicit contracts between modules would be sets of configuration options that all the modules of a given type would agree to support — both for receiving and setting.
For example, two different database systems would support the same set of options for specifying that a service needs its own database user, and would produce the same set of options describing how to connect to the database in response. Each system could still have its own extra configuration options to cover things unique to that system, but they would support a common core of operations. This would make it easy to swap one equivalent service for another, giving more choice to the end user. "That's to me very powerful", Penninckx said.
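To make the idea more concrete, here is one way a database contract could be written down as a NixOS module. This is a hypothetical sketch: the contracts.database namespace and every option name in it are invented for illustration, and it is neither Penninckx's design nor something that exists in nixpkgs. Only lib.mkOption and the lib.types values are real nixpkgs library functions.

```nix
# A hypothetical contract module; none of these option names exist in nixpkgs.
{ lib, ... }:

{
  options.contracts.database = {
    # What a consuming service (for example, Nextcloud) asks for.
    request = {
      user = lib.mkOption {
        type = lib.types.str;
        description = "Name of the database user the service needs.";
      };
      database = lib.mkOption {
        type = lib.types.str;
        description = "Name of the database to create for that user.";
      };
    };

    # What any provider (PostgreSQL or an equivalent) reports back.
    result = {
      host = lib.mkOption {
        type = lib.types.str;
        description = "Host name the service should connect to.";
      };
      port = lib.mkOption {
        type = lib.types.port;
        description = "TCP port the database listens on.";
      };
    };
  };
}
```

Under this sketch, a service module would only set the request options and read the result options, so the database behind the contract could be replaced without touching the service's own configuration.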
Penninckx pointed out that there are already a lot of implicit contracts between modules. For example, almost every service in NixOS supports a boolean enable option, to the point where users often don't even need to check the docs to know that the option will exist. Everybody expects enable to work in the same way between services, even though that's just an undocumented convention. There are several examples of patterns like that in nixpkgs that he pointed out.
Penninckx called on the community to spend some time planning how to formalize these contracts. He recognized that this would take work, but thought that the effort would be worth it, and pointed out that the Nix community has a track record of pulling off long-term plans, such as the stage one changes from Fancher's previous talk. During Q&A, one member of the audience pointed out that the Kubernetes project already has a similar kind of abstraction, and suggested that perhaps the Nix project could co-opt some of that design. Penninckx responded that "Yes, we could probably learn a trick or two from there".
The working examples that Penninckx has put together for his own use live in the Self Host Blocks project, which includes examples of how to set up contracts with automatic integration tests to make sure that the options have consistent effects across different services of the same type. It also includes "building blocks" for the services he uses — a collection he hopes to expand over time. He invited people who were interested in the topic to join the Matrix channel dedicated to the project to discuss what the next steps are.
State of the union
Toward the end of NixCon, a chaotic mix of presenters including Eelco Dolstra, original creator of Nix, and Ron Efroni, CEO of Flox and longtime Nix contributor, gave a Nix "State of the Union" talk for 2024. They went into how the Nix project has grown, what challenges the project is currently facing, and what people can do to contribute.
The general outlook was positive, with the community growing in every metric, including contributors, donations, and active users. However, there is one source of concern, which is whether Nix's funding can keep up. "This technology is not yet fully self-sustaining", Efroni warned. Much of Nix's funding comes from NLnet, but large amounts of support are also provided by various companies and individual donations.
LogicBlox, which is no longer an independent company, had provided support for NixOS since 2010 in the form of paying for the storage costs of NixOS's build-cache infrastructure. That cache has grown to 399.3TB of infrequently accessed data, and 111.5TB of recently accessed data comprising 793 million objects. In 2023, LogicBlox decided that it was unable to keep up with the growing demands of the cache, and that it could no longer continue its sponsorship. The Nix community pulled together and found alternative funding sources over the course of two weeks, but the incident was something of a wakeup call for the Nix foundation that Nix needs additional funding to be sustainable in the long term. "The goal of the foundation is to create a Nix that is sustainable," Efroni explained.
Dolstra took over the presentation at that point to explain the technical measures that the project plans to take to reduce the costs of the cache, including garbage collecting build artifacts of releases made more than two years ago. "The cache cannot keep growing forever, probably".
He clarified that they would keep the final version of each release, to make checking out historical versions of nixpkgs easier. He also stated that they would not be removing cached source code, so that rebuilding old versions of packages would remain possible. "The sources will still be there", Dolstra promised.
Finally, a succession of contributors took the stage to talk about all the different teams within the Nix project that are necessary to make it a success, including build infrastructure, architecture, moderation, marketing, and more. They all encouraged people to get involved by joining the Nix Discourse and Matrix space.
Overall, NixCon ran smoothly and drew dozens of people excited about the future of the project. The talks are currently available in the form of two recorded SCALE livestreams (stream 1, stream 2), but the SCALE YouTube channel will eventually have all the talks separated out and posted individually.
[I would like to thank LWN's travel sponsor, the Linux Foundation, for travel assistance to Pasadena for SCALE.]
Index entries for this article | |
---|---|
Conference | Southern California Linux Expo/2024 |
Posted Mar 25, 2024 20:50 UTC (Mon)
by gmgod (subscriber, #143864)
I'm personally especially excited by the systemd integration. The startup script will never achieve the features systemd has accumulated over the years.
That being said... Talking about "Nix at scale", what's the point of using an abstraction you know does not scale before falling back on something ugly? (Talking about overlayfs' limit) I'd wager that that limit is there to prevent performance pitfalls.
It sounds like ostree/composefs would be better at doing that... And it can be used in a reproducible way (well ostree at least) from within docker/podman/buildah build... In short, overlayfs is probably not the right abstraction to pursue there.
Sounds to me like "so it turns out people like knives and we can use one to drive screws, but it does not scale and it has a few downsides". I'd just say, please use a screwdriver, even if said screwdriver is embedded in a knife.
Posted Mar 25, 2024 21:05 UTC (Mon)
by gmgod (subscriber, #143864)
That felt odd compared to the usual articles and is not written particularly clearly TBH. If you want to give more coverage to personal aspects, I'd simply suggest to put them in their own articles and keep technical stuff technical. :)
Posted Mar 25, 2024 22:07 UTC (Mon)
by mjg59 (subscriber, #23239)
Posted Mar 25, 2024 23:35 UTC (Mon)
by herrwiese (subscriber, #92825)
So yes, you might find this exaggeratedly sociological for a technical article, but technical writing isn't detached from social facts.
Posted Mar 26, 2024 13:26 UTC (Tue)
by makendo (guest, #168314)
Posted Mar 26, 2024 14:26 UTC (Tue)
by willy (subscriber, #9762)
Posted Mar 26, 2024 15:22 UTC (Tue)
by paulj (subscriber, #341)
E.g., English does already have a gender-neutral singular pronoun, with many hundreds of years of use. It should always be considered acceptable as a fall-back. (That shouldn't be controversial, but apparently there are people who get triggered over that).
Posted Mar 26, 2024 16:32 UTC (Tue)
by Wol (subscriber, #4433)
In real life, I get rather upset - sometimes very vocally - when people get my name wrong, but I draw the distinction between "they didn't know/realise" and "deliberate / reckless". The guy who asked me my name over the phone, and then DIDN'T USE the name I told him ... I couldn't get off the phone quick enough. In a GMeet call, my name is on the screen - if you don't know how to address me - use that! If I don't like it, it's my fault for not getting it changed. But if you use the name on screen, and it's wrong, it's clearly completely innocent on their part.
Cheers,
Posted Mar 25, 2024 23:59 UTC (Mon)
by flussence (guest, #85566)
Posted Mar 26, 2024 0:47 UTC (Tue)
by gmgod (subscriber, #143864)
I'm pretty happy, over here, to *not* constantly hear about Hans Reiser being in prison every time Reiserfs is mentioned, just like I'm glad "anger issues" are not mentioned when Linus is being talked about or who shagged who at the last conference.
I find it pretty ironic that someone who clearly read too much into what I wrote, tries to defend respect and open-mindedness by shitting on them. Please have a good look into the mirror next time you qualify something you imagine does not agree with you "filth".
Posted Mar 26, 2024 1:20 UTC (Tue)
by jake (editor, #205)
Let's please move on ...
jake
Posted Mar 26, 2024 2:59 UTC (Tue)
by titaniumtown (subscriber, #163761)
Posted Mar 26, 2024 0:46 UTC (Tue)
by geofft (subscriber, #59789)
Posted Mar 26, 2024 1:31 UTC (Tue)
by gmgod (subscriber, #143864)
I'm not sure if the limit is arbitrary or if there is a reason for it. All I know is that each layer is a level of indirection before files are actually opened. With lots of files and lots of layers, which was my use case, latencies were horrendous. Once the file is opened, it's fine... Native speed, really. Until the file is closed and the kernel forgot about it. Then you get latency spikes again.
From what I understood, the proposal here is to stack packages, one per layer, so that docker can be leveraged to produce composable images with nix. When the limit is reached (please have a look on your system for the number of packages you use to put this into perspective), the workaround is to use the last layer as a misc one, containing the rare packages.
It feels like shoehorning something that works pretty well (nix) into something that might be OK if we had MAX_UI64 layers (even then, I'm genuinely not sure, not because of size but because of indirection) when there are other ways to solve the original "kind-of" problem (i.e. how to shoehorn Nix in the Docker world, if possible while retaining reproducibility?). I just say "kind of" because I'm not sure Docker's way of distribution was the best for starters and because it's a shoehorning exercise: Nix works very well by itself.
Finally, if the matter is to provide Nix in containers in a repeatable and composable way, layers are not the only tools we have...
Posted Mar 26, 2024 7:37 UTC (Tue)
by pmarquesmota (subscriber, #156137)
Posted Mar 26, 2024 15:04 UTC (Tue)
by bjackman (subscriber, #109548)
I also put "Docker image" in quotes because all the other container runtimes are also better than Docker at solving their use case, and between them they seem to cover all the use cases. And container images are not runtime-specific these days.
Posted Mar 26, 2024 15:05 UTC (Tue)
by bjackman (subscriber, #109548)
Posted Mar 26, 2024 15:13 UTC (Tue)
by pj (subscriber, #4506)
Posted Mar 26, 2024 16:06 UTC (Tue)
by spacefrogg (subscriber, #119608)
Posted Mar 29, 2024 17:58 UTC (Fri)
by bauermann (subscriber, #37575)
This becomes apparent in this article's section about module contracts. In Guix, the problem described in the article becomes a simpler problem of refactoring code to make the different services more composable. Perhaps even creating a function that accepts packages as parameters and returns a service definition that uses those packages in the service instead of the default ones.
Though admittedly it does have a high learning curve, especially if one isn't used to Lisp.
Wol
Let's not go down this path
It's line 73 & 81
Happy hacking!
Nix language vs Scheme