Nix at SCALE

By Daroc Alden
March 25, 2024
SCALE

The first-ever NixCon in North America was co-located with SCALE this year. The event drew a mix of experienced Nix users and people new to the project. I attended talks that covered using Nix to build Docker images, upcoming changes to how NixOS performs early booting, and ideas for making the set of services provided in nixpkgs more useful for self hosting. (LWN covered the relationship between Nix, NixOS, and nixpkgs in a recent article.) Near the end of the conference, a collection of Nix contributors gave a "State of the Union" talk covering the growth of the project and highlighting areas of concern.

Docker

Xe Iaso, a "Senior Technophilosopher" at Fly.io, gave a talk called "Nix is a better Docker image builder than Docker's image builder". Iaso opened the talk by explaining that "Nix is different from what many developers expect", and requires "a lot of work upfront". Because of that, xe (Iaso uses xe/xer pronouns) is worried that "Nix will wither and die in the industry". Xe said that Docker, on the other hand, has had unparalleled success. It has become the "de-facto package format for the internet", with many large hosting providers (such as Iaso's own employer) offering the ability to run Docker containers as a service. Xe proposed that showing how Nix could fit into existing workflows that use Docker, while delivering better reproducibility and performance, could help drive Nix adoption.

[Xe Iaso]

Iaso said that despite its widespread use and adoption, Docker has a fatal flaw: "Docker builds are not deterministic, like not even slightly". Docker image builds frequently have access to the internet, which is needed to download packages for inclusion in the image. Unfortunately, servers on the internet change all the time, so the exact set of inputs to a build is hard to recreate, which in turn makes it difficult to recreate an image later.

This also has the effect of ensuring that some common software artifacts — such as the GNU C library (glibc) — are duplicated many times across different containers. While it is possible to split glibc into its own Docker "layer" that can be reused between Docker images when it has not changed, most Docker images don't do this. Instead, it is common to first update local package lists through the equivalent of apt update, and then install software directly into another layer. This means that every time the Docker image is rebuilt, the first layer where the package lists are fetched changes, which invalidates the second layer and requires re-downloading and storing another copy of the software that the image uses.

Iaso asked the audience to imagine: "What if your builds didn't need an internet connection, because everything was already downloaded and in the path for you before your build started?" This can be done using Nix, because "Nix lets you know exactly what you depend on ahead of time". Xe then went on to show a demo of a simple Go application, and the steps required to package it in a Docker image using Nix.

nixpkgs has a library called dockerTools — a set of tools to put packages into Docker images in "opinionated ways". Building an image with dockerTools requires little configuration if the project being built is already specified with Nix. The Docker images built in this way do not depend on Nix once built, and can be used like any other image. The library can build Docker images as a single layer, but the preferred method is to put each package in its own layer, which allows sharing layers not only between subsequent builds of the same software, but also between images for different services that have some overlap in their dependencies. Glibc, for example, is a dependency of nearly everything. Putting it in a separate layer allows every Docker image built in this way that depends on that specific version of glibc to share the relevant layer. Sharing layers doesn't just reduce storage costs, it also reduces build times, because Nix can cache image layers the same way it caches other build outputs. "It's magic; it's saved me so much time".
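As a rough illustration of what such an image definition can look like, here is a minimal sketch using dockerTools.buildLayeredImage. It is not the demo from the talk: the GNU hello package from nixpkgs stands in for the Go application, and the image name and tag are placeholders chosen for this example.

```nix
# default.nix: a sketch, not the code shown in the talk.
{ pkgs ? import <nixpkgs> { } }:

pkgs.dockerTools.buildLayeredImage {
  # Placeholder image name and tag, chosen for this example.
  name = "hello-image";
  tag = "latest";

  # Packages to include; their dependencies (glibc among them)
  # are pulled in automatically and placed in layers of their own.
  contents = [ pkgs.hello ];

  # Command the container runs by default.
  config.Cmd = [ "${pkgs.hello}/bin/hello" ];
}
```

Running nix-build on a file like this should leave an image tarball at ./result, which can then be imported with docker load < result and used like any other image.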

During Q&A, a member of the audience asked if there is "a practical limit to the number of layers". Iaso responded: "Yes, and that limit is 128, and it's dictated by the filesystem drivers that Docker uses." Xe went on to specify that if there are more Nix packages than available Docker layers, the dockerTools library will make the least popular (and therefore least likely to be shared) packages share a layer. Another person asked "What about layer ordering?", referring to the way that a Docker container contains layers in a specific order. Iaso explained that layer ordering is "an illusion" that doesn't actually impact the functionality of the image (unless one layer overwrites a file provided by another layer), and that Nix picks an arbitrary order.
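The layer budget is also something users can adjust. The sketch below assumes the maxLayers argument of dockerTools.buildLayeredImage, which caps how many layers the image may use; the value 50, the image name, and the package list are arbitrary choices for illustration.

```nix
{ pkgs ? import <nixpkgs> { } }:

pkgs.dockerTools.buildLayeredImage {
  name = "layer-budget-example"; # placeholder name
  tag = "latest";
  contents = [ pkgs.hello pkgs.coreutils ];

  # Cap the number of layers. When the image's closure contains more
  # store paths than this, the least popular paths share a layer.
  maxLayers = 50;
}
```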

Systemd

Will Fancher gave a talk about the multi-year effort to upgrade NixOS's init system. When a Linux system boots, the bootloader loads a kernel and an initial RAM disk. The kernel unpacks the RAM disk and launches an init process as PID 1 from it. It is PID 1's job to actually set up the system to run by attaching the correct disks and filesystems, loading required kernel modules, and doing anything else necessary to set up the fully booted system.

[Will Fancher]

In NixOS, this step is currently done by a shell script that Fancher referred to as "scripted stage one", which encapsulates "the sequence of steps that needs to be taken by just about any installation of NixOS". The problem with this script is that it's serial and imperative. "It's all written in this ... shell-scripty kind of way. [...] It's all a little co-dependent. It's awkward to maintain, and it's frustrating to write." It's also "a lot of custom code" that NixOS can't really directly share with other projects.

Since 2022, Fancher has worked with other members of the Nix community on an alternative. Systemd is already used as PID 1 on a fully booted NixOS system; this new work also makes it part of the initial RAM disk used to set up the system. Using systemd from the beginning has several advantages: systemd is declarative and parallel, and it provides "tools that come straight out of the systemd project, that we don't have to develop and maintain ourselves". These tools include niceties such as rescue and debug shells, systemd-networkd for configuring networking in a more robust way, systemd-ask-password to unify the interface for letting services request passwords on boot, and systemd-cryptsetup for supporting hardware keys for disk encryption.

Fancher gave an example from his own work where a server had a complicated tangle of different encrypted disks and networking services that were required to boot properly. He said that systemd lets you specify these kinds of complicated setups in an elegant way. He also pointed out that systemd's parallelism might speed up the early boot by preparing multiple disks simultaneously, although since this phase of the boot process is "only a few seconds for most systems", the impact won't be large.

The work to bring systemd to NixOS's early boot is quite far along. The option to enable the new mode was stabilized toward the end of last year in version 23.11, and Fancher hopes for it to become the default in version 24.05, which is expected in May. He said that there were still minor incompatibilities, but that those are detected at build time and a warning is issued while falling back to the existing code. He called out bcachefs, in particular, as something that does not yet work with the new setup, saying: "The reason it works in scripted stage one at all right now is basically luck." NixOS users wishing to try the new stage one implementation early can set boot.initrd.systemd.enable in their configurations.
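In a NixOS configuration, opting in would look roughly like the following fragment. The option name is the one given above; the surrounding module boilerplate is an assumption about a typical configuration.nix.

```nix
{ ... }:
{
  # Use systemd in the initial RAM disk instead of scripted stage one.
  boot.initrd.systemd.enable = true;
}
```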

Once the new mode becomes the default, version 24.11 (expected in November) will remove the networking code in the existing scripted stage one, which Fancher called out as particularly "janky". Assuming this goes as planned, version 25.05 will remove scripted stage one entirely. At the end of his talk, Fancher said that the work had involved a lot of collaboration, and said that a takeaway should be: "If there's a little thing that you want to contribute, then you should go ahead and contribute."

Module contracts

Pierre Penninckx gave a more theoretical talk calling for the Nix community to standardize a set of conventions for configuring services. Standing up a service that has been packaged for NixOS is easy; Penninckx showed how to run Nextcloud on a NixOS server:

    services.nextcloud.enable = true;

"Since the moment I saw that, I thought 'I need to use Nix for everything now'."

This ease comes at a cost, however. All of the other services that a particular service depends on are hard-coded by the maintainer of the NixOS module. For example, Nextcloud depends on NGINX, PostgreSQL, and a handful of other services. A user wishing to substitute Apache as the reverse proxy in front of Nextcloud would be out of luck unless they forked the NixOS module and wrote their own definition. This also means that there's no good way to reuse the configuration of smaller components between different modules.

[Pierre Penninckx]

To remedy this, Penninckx suggested having explicit contracts between modules. NixOS modules communicate with each other via configuration options in a central attribute set (Nix's equivalent of a dictionary). Explicit contracts between modules would be sets of configuration options that all the modules of a given type would agree to support — both for receiving and setting.

For example, two different database systems would support the same set of options for specifying that a service needs its own database user, and would produce the same set of options describing how to connect to the database in response. Each system could still have its own extra configuration options to cover things unique to that system, but they would support a common core of operations. This would make it easy to swap out two different equivalent services, giving more choice to the end user. "That's to me very powerful".
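To make the idea concrete, here is one way such a database contract could be written down as a NixOS module. This is only a sketch: every option name in it (contracts.database, request, result, and so on) is hypothetical and invented for this illustration; it is not taken from Penninckx's talk or from any existing project.

```nix
# All option names below are hypothetical; they illustrate the idea only.
{ lib, ... }:

let
  inherit (lib) mkOption types;

  databaseContract = types.submodule {
    options = {
      # What a consuming service asks for.
      request = {
        user = mkOption {
          type = types.str;
          description = "Database user the consumer wants created.";
        };
        database = mkOption {
          type = types.str;
          description = "Name of the database owned by that user.";
        };
      };

      # What any provider fills in, in the same shape.
      result = {
        host = mkOption {
          type = types.str;
          description = "Host on which the database is reachable.";
        };
        port = mkOption {
          type = types.port;
          description = "TCP port on which the database listens.";
        };
      };
    };
  };
in
{
  options.contracts.database = mkOption {
    type = types.attrsOf databaseContract;
    default = { };
    description = "Database requests, keyed by the consuming service.";
  };
}
```

Under a scheme like this, a consumer would only set the request options and read the result options, so any database module that honors the same shape could be swapped in without changing the consumer.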

Penninckx pointed out that there are already a lot of implicit contracts between modules. For example, almost every service in NixOS supports a boolean enable option, to the point where users often don't even need to check the docs to know that the option will exist. Everybody expects enable to work in the same way between services, even though that's just an undocumented convention. There are several examples of patterns like that in nixpkgs that he pointed out.
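The enable convention typically takes the following shape in a module. This sketch uses the mkEnableOption and mkIf helpers from the nixpkgs library; the service name "example" and the use of the hello package as its program are placeholders for illustration.

```nix
{ config, lib, pkgs, ... }:

let
  # "example" is a hypothetical service name.
  cfg = config.services.example;
in
{
  # Declares a boolean option that defaults to false.
  options.services.example.enable = lib.mkEnableOption "the example service";

  # The rest of the configuration only applies when the option is set.
  config = lib.mkIf cfg.enable {
    systemd.services.example = {
      description = "Example service";
      wantedBy = [ "multi-user.target" ];
      serviceConfig.ExecStart = "${pkgs.hello}/bin/hello";
    };
  };
}
```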

Penninckx called on the community to spend some time planning how to formalize these contracts. He recognized that this would take work, but thought that the effort would be worth it, and pointed out that the Nix community has a track record of pulling off long-term plans, such as the stage one changes from Fancher's previous talk. During Q&A, one member of the audience pointed out that the Kubernetes project already has a similar kind of abstraction, and suggested that perhaps the Nix project could co-opt some of that design. Penninckx responded that "Yes, we could probably learn a trick or two from there".

The working examples that Penninckx has put together for his own use live in the Self Host Blocks project, which includes examples of how to set up contracts with automatic integration tests to make sure that the options have consistent effects across different services of the same type. It also includes "building blocks" for the services he uses — a collection he hopes to expand over time. He invited people who were interested in the topic to join the Matrix channel dedicated to the project to discuss what the next steps are.

State of the union

Toward the end of NixCon, a chaotic mix of presenters including Eelco Dolstra, original creator of Nix, and Ron Efroni, CEO of Flox and longtime Nix contributor, gave a Nix "State of the Union" talk for 2024. They went into how the Nix project has grown, what challenges the project is currently facing, and what people can do to contribute.

The general outlook was positive, with the community growing in every metric, including contributors, donations, and active users. However, there is one source of concern, which is whether Nix's funding can keep up. "This technology is not yet fully self-sustaining", Efroni warned. Much of Nix's funding comes from NLnet, but large amounts of support are also provided by various companies and individual donations.

LogicBlox, which is no longer an independent company, had provided support for NixOS since 2010 in the form of paying for the storage costs of NixOS's build-cache infrastructure. That cache has grown to 399.3TB of infrequently accessed data, and 111.5TB of recently accessed data comprising 793 million objects. In 2023, LogicBlox decided that it was unable to keep up with the growing demands of the cache, and that it could no longer continue its sponsorship. The Nix community pulled together and found alternative funding sources over the course of two weeks, but the incident was something of a wakeup call for the Nix foundation that Nix needs additional funding to be sustainable in the long term. "The goal of the foundation is to create a Nix that is sustainable," Efroni explained.

Dolstra took over the presentation at that point to explain the technical measures that the project plans to take to reduce the costs of the cache, including garbage collecting build artifacts of releases made more than two years ago. "The cache cannot keep growing forever, probably".

He clarified that they would keep the final version of each release, to make checking out historical versions of nixpkgs easier. He also stated that they would not be removing cached source code, so that rebuilding old versions of packages would remain possible. "The sources will still be there", Dolstra promised.

Finally, a succession of contributors took the stage to talk about all the different teams within the Nix project that are necessary to make it a success, including build infrastructure, architecture, moderation, marketing, and more. They all encouraged people to get involved by joining the Nix Discourse and Matrix space.

Overall, NixCon ran smoothly and drew dozens of people excited about the future of the project. The talks are currently available in the form of two recorded SCALE livestreams (stream 1, stream 2), but the SCALE YouTube channel will eventually have all the talks separated out and posted individually.

[I would like to thank LWN's travel sponsor, the Linux Foundation, for travel assistance to Pasadena for SCALE.]


Index entries for this article
Conference: Southern California Linux Expo/2024



Nix at SCALE

Posted Mar 25, 2024 20:50 UTC (Mon) by gmgod (subscriber, #143864) [Link] (14 responses)

Happy to see all that work poured into Nix.

I'm personally especially excited by the systemd integration. The startup script will never achieve the features systemd has accumulated with the years.

That being said... Talking about "Nix at scale", what's the point using an abstraction you know does not scale before falling back on something ugly? (Talking about overlayfs' limit) I'd wager that that limit is there to prevent performance pitfalls.

It sounds like ostree/composefs would be better at doing that... And it can be used in a reproducible way (well ostree at least) from within docker/podman/buildah build... In short, overlayfs is probably not the right abstraction to pursue there.

Sounds to me like "so turns out people like knives and we can use one to put down screws it turns out but it does not scale and it has a few downsides". I'd just say, please use a screwdriver, even if said screwdriver is embedded in a knife.

Nix at SCALE

Posted Mar 25, 2024 21:05 UTC (Mon) by gmgod (subscriber, #143864) [Link] (10 responses)

Not sure how I feel about being explained how a rando wants to be referred to in an LWN technical article though. I'd have preferred that the author sticks with family name, to avoid any faux-pas, instead of getting into personal details that don't matter for the technical subject at hand.

That felt odd compared to the usual articles and is not written particularly clearly TBH. If you want to give more coverage to personal aspects, I'd simply suggest to put them in their own articles and keep technical stuff technical. :)

Nix at SCALE

Posted Mar 25, 2024 22:07 UTC (Mon) by mjg59 (subscriber, #23239) [Link]

As a contrary position, given the overlap between xer name and pronouns, I appreciated the clarification. When the subject of an article's pronouns are known, it makes sense to use them rather than change style purely to avoid their use.

Nix at SCALE

Posted Mar 25, 2024 23:35 UTC (Mon) by herrwiese (subscriber, #92825) [Link] (4 responses)

I'm sure that a sequence of »Iaso thinks, Iaso says, then Iaso mentioned, …« wouldn't have had a positive effect on readability, either, so you *want* to use pronouns. Using Iaso's chosen pronouns without explanation would (still) have been as clunky as not using pronouns at all. Not using them would be straight ignorant (one might find harsher words).

So yes, you might find this exaggeratedly sociological for a technical article, but technical writing isn't detached from social facts.

Nix at SCALE

Posted Mar 26, 2024 13:26 UTC (Tue) by makendo (guest, #168314) [Link] (3 responses)

We should keep only "they".

Nix at SCALE

Posted Mar 26, 2024 14:26 UTC (Tue) by willy (subscriber, #9762) [Link] (2 responses)

Yes, telling people how they should be called has worked out so very well on numerous occasions.

Nix at SCALE

Posted Mar 26, 2024 15:22 UTC (Tue) by paulj (subscriber, #341) [Link] (1 responses)

It kind of goes both ways really. Various factions could do with applying a bit more Postel principle - be courteous in using whatever title when you can, but also don't get worked up over people sometimes getting it wrong (cause... it's often not obvious, and it can be hard to keep up / know, and sometimes people prefer using extant and appropriate constructions).

E.g., English does already have a gender-neutral singular pronoun, with many hundreds of years of use. It should always be considered acceptable as a fall-back. (That shouldn't be controversial, but apparently there are people who get triggered over that).

Nix at SCALE

Posted Mar 26, 2024 16:32 UTC (Tue) by Wol (subscriber, #4433) [Link]

+1

In real life, I get rather upset - sometimes very vocally - when people get my name wrong, but I draw the distinction between "they didn't know/realise" and "deliberate / reckless". The guy who asked me my name over the phone, and then DIDN'T USE the name I told him ... I couldn't get off the phone quick enough. In a GMeet call, my name is on the screen - if you don't know how to address me - use that! If I don't like it, it's my fault for not getting it changed. But if you use the name on screen, and it's wrong, it's clearly completely innocent on their part.

Cheers,
Wol

Nix at SCALE

Posted Mar 25, 2024 23:59 UTC (Mon) by flussence (guest, #85566) [Link] (2 responses)

God how I miss the *normal* anti-systemd trolls instead of this alt-right filth...

Nix at SCALE

Posted Mar 26, 2024 0:47 UTC (Tue) by gmgod (subscriber, #143864) [Link] (1 responses)

Rather aggressive way to react to the opinion that technical stuff should remain technical.

I'm pretty happy, over here, to *not* constantly hear about Hans Reiser being in prison every time Reiserfs is mentioned, just like I'm glad "anger issues" are not mentioned when Linus is being talked about or who shagged who at the last conference.

I find it pretty ironic that someone who clearly read too much into what I wrote, tries to defend respect and open-mindedness by shitting on them. Please have a good look into the mirror next time you qualify something you imagine does not agree with you "filth".

Let's not go down this path

Posted Mar 26, 2024 1:20 UTC (Tue) by jake (editor, #205) [Link]

I don't think we are going to find this particular sub-thread useful to anyone involved, nor to the bystanders.

Let's please move on ...

jake

Nix at SCALE

Posted Mar 26, 2024 2:59 UTC (Tue) by titaniumtown (subscriber, #163761) [Link]

It's five words, I think you'll survive.

Nix at SCALE

Posted Mar 26, 2024 0:46 UTC (Tue) by geofft (subscriber, #59789) [Link] (2 responses)

What's the involvement of overlayfs and what's the relevant limit?

Nix at SCALE

Posted Mar 26, 2024 1:31 UTC (Tue) by gmgod (subscriber, #143864) [Link] (1 responses)

Overlayfs as used by the overlay2 Docker driver has a hard limit of 128 layers (actually in practice it's a bit less, I think there is an old issue about it on github, I know because I hit the limit).

I'm not sure if the limit is arbitrary or if there is a reason for it. All I know is that each layer is a level of indirection before files are actually opened. With lots of files and lots of layers, which was my use case, latencies were horrendous. Once the file is opened, it's fine... Native speed, really. Until the file is closed and the kernel forgot about it. Then you get latency spikes again.

From what I understood, the proposal here is to stack packages, one per layer, so that docker can be leveraged to produce composable images with nix. When the limit is reached (please have a look on your system for the number of packages you use to put this into perspective), the workaround is to use the last layer as a misc one, containing the rare packages.

It feels like shoehorning something that works pretty well (nix) into something that might be OK if we had MAX_UI64 layers (even then, I'm genuinely not sure, not because of size but because of indirection) when there are other ways to solve the original "kind-of" problem (i.e. how to shoehorn Nix in the Docker world, if possible while retaining reproducibility?). I just say "kind of" because I'm not sure Docker's way of distribution was the best for starters and because it's a shoehorning exercise: Nix works very well by itself.

Finally, if the matter is to provide Nix in containers in a repeatable and composable way, layers are not the only tools we have...

Nix at SCALE

Posted Mar 26, 2024 7:37 UTC (Tue) by pmarquesmota (subscriber, #156137) [Link]

It's line 73 & 81
Happy hacking!

Nix at SCALE

Posted Mar 26, 2024 15:04 UTC (Tue) by bjackman (subscriber, #109548) [Link] (1 responses)

If you'll forgive the Google-brained comment, Bazel is also a better "Docker image" builder than Docker's image builder. (https://github.com/bazelbuild/rules_docker)

I also put "Docker image" in quotes because all the other container runtimes are also better than Docker at solving their use case, and between them they seem to cover all the use cases. And container images are not runtime-specific these days.

Nix at SCALE

Posted Mar 26, 2024 15:05 UTC (Tue) by bjackman (subscriber, #109548) [Link]

(Wow, you can really tell I have Google Brain, because the thing I linked is already deprecated)

Nix at SCALE

Posted Mar 26, 2024 15:13 UTC (Tue) by pj (subscriber, #4506) [Link] (2 responses)

I've been using nix (the package manager) a bit over the last couple years and slowly learning more nix (the language). Anyone else get the feeling it's really an overgrown DSL that needs to be turned into like a lua extension or something?

Nix at SCALE

Posted Mar 26, 2024 16:06 UTC (Tue) by spacefrogg (subscriber, #119608) [Link]

It is a very particular language trying to work a very particular problem space efficiently: It solves the problem of efficiently extracting your working solution out of a mostly broken and incompatible code base. And doing that reliably, repeatably and re-trace-ably. It also solves a very heavy communications problem on the side, which is communicating "your working solution" to others effectively so they can benefit from it.

Nix language vs Scheme

Posted Mar 29, 2024 17:58 UTC (Fri) by bauermann (subscriber, #37575) [Link]

I never actually used Nix (I read the Nix PhD thesis so I have a passing familiarity with the language), but my impression is that this is one area where Guix really shines: it uses Scheme for package and services definitions (well, for almost everything actually) instead of creating an ad-hoc language.

This becomes apparent in this article's section about module contracts. In Guix, the problem described in the article becomes a simpler problem of refactoring code to make the different services more composable. Perhaps even creating a function that accepts packages as parameters and returns a service definition that uses those packages in the service instead of the default ones.

Though admittedly it does have a high learning curve, especially if one isn't used to Lisp.


Copyright © 2024, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds