
Leading items

Welcome to the LWN.net Weekly Edition for October 24, 2024

This edition contains the following feature content:

  • Free-software foundations face fundraising problems: GNOME, the Python Software Foundation, KDE e.V., and the FSFE are all feeling the squeeze.
  • Toward safe transmutation in Rust: Jack Wrenn's RustConf 2024 talk on making transmutation safer.
  • Python PGP proposal poses packaging puzzles: a proposal to drop PGP signatures for CPython releases in favor of sigstore.
  • The long road to lazy preemption: simplifying the kernel's preemption modes.
  • A report from the 2024 Image-Based Linux Summit: notes from this year's gathering in Berlin.

This week's edition also includes these inner pages:

  • Brief items: Brief news items from throughout the community.
  • Announcements: Newsletters, conferences, security updates, patches, and more.

Please enjoy this week's edition, and, as always, thank you for supporting LWN.net.

Comments (none posted)

Free-software foundations face fundraising problems

By Joe Brockmeier
October 23, 2024

In July, at the GNOME annual general meeting (AGM), held at GUADEC 2024, the message from the GNOME Foundation board was that all was well, financially speaking. Not great, but the foundation was on a break-even budget and expected to go into its next fiscal year with a similar budget and headcount. On October 7, however, the board announced that it had had to make some cuts, including reducing its staff by two people. This is not, however, strictly a GNOME problem: similar organizations, such as the Python Software Foundation (PSF), KDE e.V., and the Free Software Foundation Europe (FSFE) are seeing declines in fundraising while also being affected by inflation.

In April, GNOME Foundation board president Robert McQueen wrote on his blog that the foundation had been operating at a deficit for more than three years. That was possible because the foundation had received "some substantial donations" in the years prior, but the organization had now used up the surplus. GNOME has a reserves policy that requires it to keep enough in reserve to maintain core operations, which meant "the Board can't approve any more deficit budgets — to keep spending at the same level we must increase our income".

GNOME has not, however, increased its income. In fact, the organization brought in less money than expected during the fiscal year that ended on September 30. According to the board update, the problem was twofold, a "very challenging fundraising environment for nonprofits, on top of internal challenges" that included the departure of executive director Holly Million after less than one year.

The board published a follow-up on October 9, with more detail about the unexpected announcement of budget cuts. McQueen said that the board had been presented with a break-even budget for the year that ended on September 30, 2024. That budget had projected $1.201 million in revenue, and $1.195 million in expenses for the fiscal year. The revenue figure included an additional $475,000 that was expected to come in via donations, grants, and event sponsorships, but did not materialize.

Aiming for surplus

The budget for GNOME's current fiscal year, which runs October 1, 2024 through September 30, 2025, is about half of the prior year's budget. The foundation is projecting expenses of about $550,000 and income of about $586,000, its smallest budget since 2019 by more than $80,000. Those figures do not include the money that the foundation holds for GIMP, or money that it handles as part of a €1 million grant from Germany's Sovereign Tech Fund (STF), as neither are part of the foundation's general fund.

Nearly 50% of the budget comes from a $250,000 grant from Endless, and $200,000 of that is a targeted donation for specific work, such as running Flathub, maintenance of the GNOME Software application, work toward adding donations and payments for applications on Flathub, and more. Less than half of the foundation's income is expected to come from individual donations and fees for organizations to be on the GNOME advisory board.

The foundation is only allocating $10,000 of this year's budget for interim executive director Richard Littauer, who is expected to continue in that role until December 10. The foundation is currently searching for a new executive director, with a base salary between $120,000 and $150,000. GNOME hopes to find someone else to foot that bill, at least for the first year. McQueen said that the board is negotiating with a sponsor to cover the first year of a new director's salary. Whether GNOME will make a hire before Littauer departs is unclear.

GNOME's income has been highly variable in the past few years. The foundation brought in more than $870,000 in 2019, and more than $967,000 in 2020. In 2021 its income was less than $300,000, and in 2022 the foundation brought in about $560,000. Some of that can be attributed to the pandemic, which meant in-person events (and accompanying sponsorships) did not take place. GNOME has not returned to its pre-pandemic income levels.

Viewed in isolation, it might appear that something is amiss with the GNOME Foundation in particular—but GNOME is not unique in seeing a decrease in donations. I sent questions to the PSF, KDE e.V., and FSFE to get a more complete picture of free-software fundraising in 2024. All of the organizations are facing fundraising or budget challenges, to varying degrees.

Python squeezed by inflation

Deb Nicholson, executive director of the PSF, said that the foundation had seen a small decrease in fundraising, coupled with inflation, "which no one can escape". The effects of inflation were particularly noticeable with respect to the costs of putting on PyCon US. One bright spot, Nicholson said, is that the PSF had seen more willingness by organizations to fund security work.

The foundation plans to do "a little belt-tightening" in 2025 "because we prefer to act proactively to maintain a comfortable financial cushion". How much tightening, though, is unclear—she said the PSF does not have a budget for 2025 to share at the moment.

Python is also operating with a much larger budget than GNOME. According to the 2023 annual report that was published in May, the PSF brought in about $4.36 million through its 2023 fiscal year and spent more than $4.5 million, for a deficit of about $152,000. However, it also has a comfortable cushion, with more than $5.4 million in assets on hand.

Eike Hein, treasurer for KDE e.V., said that there were parallels between GNOME and KDE aside from the obvious. For example, "both foundations have in past years received large one-time donations" that increased reserves. She said that KDE e.V. had an obligation under German nonprofit law to put resources towards its mission, and it had used the money to grow the footprint of the organization. For example, "by directly contracting more types of work, from event and project coordination to software development." That means the organization has been running an "intentional deficit for two years and going". It had hoped to spend down its reserves sooner, she said, but "COVID-19 made spending money difficult".

According to a report presented by KDE's financial working group, the project has seen "a lot of new fundraising" and income increased from €285,000 last year to €350,000 this year. Hein said that the organization had expanded the ways donors could contribute; since then, it has seen a large increase in recurring donations, and individual contributions have increased.

Expenses have also increased, from €385,000 to €475,000, the bulk of which are personnel expenses. Hein also noted that events are more expensive than ever:

For example, these days we generally have to rent or otherwise pay for the venue, even at universities; in the past, the community got more support for free. 2024 in particular has been a more difficult year for event sponsorship, also for LAS [Linux App Summit], with many recurring corporate sponsors having no budget this year.

Corporate sponsors tightening budgets is a recurring theme. She said that donations from corporate donors are stable, but "we've not seen new corporate sponsors in the last year and generally have the feeling that money is indeed tighter for many of them these days". Hein added that KDE e.V. has not raised its corporate membership prices "in about 20 years", making them relatively low compared to other organizations. For next year's budget, KDE e.V. expects to increase its income to €475,000 and start working toward a break-even budget in the 2026-2027 time frame.

Raising more to do the same

Matthias Kirschner, president of the FSFE, said that it was also having a challenging time raising money:

Fundraising has always been a challenging task, and in recent months, it has become even more difficult due to the current global situation. High inflation, political instability in and around Europe, widespread layoffs, and budget cuts in the IT sector have all had a direct impact on our ability to secure funding.

He also noted that inflation was driving up costs across the board, affecting salaries, infrastructure expenses, as well as travel. That means the organization has to raise more money just to maintain the same level of services. Raising more money, though, has been a challenge—especially as competition for limited resources increases. Kirschner said the feedback from grant providers indicates "there are now significantly more applications competing for the limited funding sources still available", which makes things even more challenging.

Hein said that there was public money available in Europe via programs like the Next Generation Internet (NGI) fund through NLnet, the Prototype Fund, STF, and others, but tapping into that money as an organization is difficult:

It's generally smarter to apply as individuals to many of these opportunities, as low per-project caps make splitting initiatives across multiple heads and grant applications maximizes payout - recommended strategy by the spokespeople of several of these funds. For KDE e.V. this means we help facilitate grant application processes (e.g. letters of support for applicants outside Europe, who we can give our backing to as an EU-based org), but don't receive funding ourselves.

Kirschner noted that some of the public funds were also drying up, with significant cuts to NGI for next year. He was also more pessimistic than Hein about public monies in general, and said that the FSFE, free-software organizations, and individuals are "currently facing a significant lack of sustainable long-term public funding at both the member state and EU levels".

He also argued that the companies and individuals that consider withdrawing support during hard times should consider the long term:

If you value the work of non-profits dedicated to work for achieving software freedom, please consider making a donation to support them. Reducing your contributions to charitable non-profits during times of inflation and other crises threatens their long-term missions.

Looking forward

Even though GNOME is not the only organization facing environmental challenges for fundraising, it does have some unique problems in addressing them. Without a full-time executive director to work on fundraising, the organization is going to have a hard time bringing in new money to return to previous spending levels. It is also unusually dependent on a single donor, Endless. Diversification of funding is going to be increasingly important not just for GNOME, but for most if not all free and open-source organizations. Perhaps the Open Source Pledge, or something like it, will help drive more donations.

The GNOME Foundation hopes to launch a crowdfunding platform in the next year, which may be helpful. KDE's individual supporters account for more than half of that organization's income, but individual supporters only contribute about 13% of GNOME's revenue. Doubling that figure wouldn't solve all of its problems, but it would be a start. If it can persuade more users to chip in, it may be able to offset the decline in corporate sponsorships. Hopefully GNOME will complete its search for an executive director soon, and the foundation will be able to put its budget woes behind it.

There are many other free and open-source organizations, beyond the four mentioned here, that are struggling for money today. Outreachy, for example, recently put out a call for help, saying that it has also found finding funding "extremely difficult this year". Some, such as the Open Collective Foundation (OCF), have already folded. The overall forecast for free-software fundraising looks like rough weather for the foreseeable future.

Comments (28 posted)

Toward safe transmutation in Rust

By Daroc Alden
October 23, 2024
RustConf 2024

Currently in Rust, there is no efficient and safe way to turn an array of bytes into a structure that corresponds to the array. Changing that was the topic of Jack Wrenn's talk this year at RustConf: "Safety Goggles for Alchemists". The goal is to be able to "transmute" — Rust's name for this kind of conversion — values into arbitrary user-defined types in a safer way. Wrenn justified the approach that the project has taken to accomplish this, and spoke about the future work required to stabilize it.

The basic plan is to take the existing unsafe std::mem::transmute() function, which instructs the compiler to reinterpret part of memory as a different type (but requires the programmer to ensure that this is reasonable), and make a safe version that can check the necessary invariants itself. The first part of Wrenn's talk focused on what those invariants are, and how to check them.

[Jack Wrenn]

The first thing to worry about is bit validity — whether every pattern of bits that can be produced by the input type is also valid for the output type. So, for example, transmuting bool to u8 is valid, because every boolean value is stored as one byte and therefore is also a valid u8. On the other hand, transmuting a u8 to a bool is invalid, because some values of u8 don't correspond to a bool (such as, for example, 17). The next invariant to worry about is alignment. Some types must be aligned to a particular boundary in memory. For example, u16 values must be aligned to even addresses on most platforms. Converting from one type to another is only valid if the storage of the type is aligned to a large enough boundary for values of the target type.
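
As a minimal illustration of that bit-validity rule (a sketch, not code from the talk), the bool-to-u8 direction can be expressed with today's unsafe std::mem::transmute(), while the reverse direction needs an explicit check:

    use std::mem;

    fn main() {
        // Every bool is one byte whose value is 0 or 1, so it is also a
        // valid u8; this direction is sound (though still unsafe to call).
        let byte: u8 = unsafe { mem::transmute::<bool, u8>(true) };
        assert_eq!(byte, 1);

        // The reverse is not: a u8 such as 17 has no corresponding bool,
        // so transmuting it would be undefined behavior. A checked
        // conversion has to reject such bit patterns instead.
        fn u8_to_bool(n: u8) -> Option<bool> {
            match n {
                0 => Some(false),
                1 => Some(true),
                _ => None, // e.g. 17: not a valid bit pattern for bool
            }
        }
        assert_eq!(u8_to_bool(17), None);
    }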

Code implementing transmutation in any language would need to worry about bit validity and alignment, but there are also two requirements for safe transmutation that are unique to Rust: lifetimes and safety invariants upheld by constructors. Both of these are related to the way that Rust can validate programmer-specified invariants using the type system. If a transmutation would break Rust's lifetime tracking, it is invalid. But it could also be invalid if it let someone construct a type that does not have a public constructor. For example, many Rust APIs hand out guard objects that do something when they are dropped. If a programmer could transmute a byte array into a MutexGuard for some mutex without locking it, that could cause significant problems. So transmutation should also not be used to create types that uphold safety requirements by having smart constructors.

Still — if the programmer can ensure that these four criteria are met, transmutation can be quite useful. Wrenn gave the example of parsing a UDP packet. In a traditional parser, the programmer would have to copy all of the data in the UDP header at least once in order to move it from the incoming buffer into a structure. But UDP headers were designed to be possible to simply interpret directly as a structure, as long as its fields have the correct sizes. This could let the program parse a packet without any copying whatsoever.

So it would be really nice to have safe transmutation. This has prompted the Rust community to produce several crates that provide safe abstractions around transmutation. The two that Wrenn highlighted were bytemuck and zerocopy. He is the co-maintainer of zerocopy, so he chose that crate to "pick on".

Both of these crates work by adding a marker trait, he explained — a trait which has no methods, and only exists so that the programmer can write type bounds that specify that a type needs to implement that trait to be used in some function. The trait is unsafe to implement, so implementing it is essentially a promise to zerocopy that the programmer has read the relevant documentation and ensured that the type meets the library's requirement. Then the library itself can include implementations for primitive types, as well as a macro to implement the marker trait for structures where it is safe to do so. This approach works. Google uses it in the networking stack for the Fuchsia operating system, he said.
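
The marker-trait pattern can be sketched roughly as follows; this is a simplified, hypothetical illustration of the idea, not the actual bytemuck or zerocopy API:

    /// Hypothetical marker trait: no methods, only a promise. Implementing
    /// it asserts that any byte sequence of the right length and alignment
    /// is a valid value of the type.
    unsafe trait PlainOldData: Sized {}

    // The library would provide implementations for primitive types...
    unsafe impl PlainOldData for u8 {}
    unsafe impl PlainOldData for [u8; 2] {}

    // ...and safe functions whose bounds require the marker trait.
    fn view<T: PlainOldData>(bytes: &[u8]) -> Option<&T> {
        if bytes.len() < std::mem::size_of::<T>()
            || bytes.as_ptr() as usize % std::mem::align_of::<T>() != 0
        {
            return None;
        }
        // SAFETY: length and alignment were checked above, and the
        // PlainOldData bound promises any bit pattern is valid for T.
        Some(unsafe { &*(bytes.as_ptr() as *const T) })
    }

    // A UDP-header-like structure; a real crate's derive macro would
    // verify the layout before emitting the unsafe impl.
    #[repr(C)]
    struct UdpHeader {
        src_port: [u8; 2],
        dst_port: [u8; 2],
        length: [u8; 2],
        checksum: [u8; 2],
    }
    unsafe impl PlainOldData for UdpHeader {}

    fn main() {
        let packet = [0u8; 64]; // pretend this came off the network
        let header: &UdpHeader = view(&packet).expect("packet too short");
        assert_eq!(u16::from_be_bytes(header.length), 0);
    }

The real crates are considerably more thorough about details such as padding and mutable access, but the division of labor is the same: a small amount of audited unsafe code behind a safe, trait-bounded API.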

But zerocopy has a "dirty secret": it depends on nearly 14,000 lines of subtle unsafe code, Wrenn warned. Worse, most of this code is repeating analyses that the compiler already has to do for other reasons. It would be more useful if this kind of capability came built-in to the compiler.

"Project Safe Transmute"

All of this is what motivated the creation of "Project Safe Transmute", Wrenn said. That project is an attempt to bring native support for safe transmutation to the Rust compiler.

That effort is based around a particular "theory of type alchemy", Wrenn explained. The idea is to track whether all possible values of one type are also possible values of another. For example, a NonZeroU8 can be converted to a u8 without a check, but not vice versa. But determining this kind of relationship automatically is trickier than it might initially appear. Performing the analysis naively, by reasoning in terms of sets of possible values, quickly becomes inefficient. Instead, the compiler models a type as a finite-state machine, Wrenn said. Each field or piece of padding in the type becomes a state, with edges representing valid values. Therefore all values are represented by a path through the machine, and can be worked with using relatively straightforward algorithms, but the representation does not blow up in size as a type gets more complicated.

With this theory in place, it was practical to implement this analysis in the compiler. So Wrenn and his collaborators implemented it, resulting in the following trait that is automatically implemented on the fly by the compiler for any two compatible types:

    unsafe trait TransmuteFrom<Src: ?Sized> {
        fn transmute(src: Src) -> Self
        where
            Src: Sized,
            Self: Sized;
    }

Since this work is integrated into the compiler, attempting to convert two types that are not compatible will give a custom error message explaining why. The compiler checks all four requirements Wrenn described previously — which is exactly the source of the next problem. How can the compiler know whether a user-defined type has safety requirements that are checked by a constructor? It can't, so it must conservatively assume that user-defined types cannot be the target of a transmutation (although they can still be the input to one).

This "isn't all that useful", though. Transmuting things into user-defined types was a requirement for the use cases Wrenn had discussed. It turns out that often what people want is not safe transmutation, but safer transmutation. So the people working on transmutation added an extra generic parameter to the TransmuteFrom trait that the programmer can use in order to promise the compiler that one or more of the safety requirements is met, even if the compiler cannot prove that. The parameters are Assume::VALIDITY for bit-validity, Assume::ALIGNMENT for alignment, Assume::LIFETIMES for lifetimes, and Assume::SAFETY for user safety invariants. Now, it is possible to target user types by giving a Assume::SAFETY parameter to the operation:

    #[repr(transparent)]
    pub struct Even {
        // The compiler doesn't know about the following,
        // but our code depends on this for some reason:
        // SAFETY: Always an even number!
        n: u8
    }

    fn u8_to_even(src: u8) -> Even {
        assert!(src % 2 == 0);
        unsafe { TransmuteFrom::<_, Assume::SAFETY>::transmute(src) }
    }

It may seem as though requiring the use of unsafe to do transmutation represents a lack of progress. But this design has the advantage that the programmer only needs to assert the safety of the specific invariant that the compiler is unable to prove — the above code still uses the compile-time checks for bit-validity, alignment, and lifetime problems. So the work, which is available for testing on nightly, doesn't make transmutation completely safe, but it does provide "effective safety goggles" to make sure that as much as possible is checked by the compiler, and that therefore the programmer only needs to check the things that are genuinely not possible for the compiler to ascertain.

Future outlook

Wrenn ended by summarizing the future work needed to polish the feature: supporting dynamically sized types, adding an API for fallible transmutation, optimizing the implementation of the bit-validity checks in the compiler, improving the portability of type layouts, and finally stabilizing the work. He hopes that TransmuteFrom might have an RFC for stabilization in 2025, but said that it needed testing and feedback before that, and called on the audience to provide that testing. Whether users will find this API to be an improvement over the existing crates remains to be seen, but it seems clear that transmutation is too useful not to support as part of Rust itself in some way.

Comments (17 posted)

Python PGP proposal poses packaging puzzles

By Joe Brockmeier
October 21, 2024

Sigstore is a project that is meant to simplify and improve the process of signing, verifying, and protecting software. It is a relatively new project, declared "generally available" in 2022. Python is an early adopter of sigstore; it started providing signatures for CPython artifacts with Python 3.11 in 2022. This is in addition to the OpenPGP signatures it has been providing since at least 2001. Now, Seth Michael Larson—the Python Software Foundation (PSF) security developer-in-residence—would like to deprecate the PGP signature and move to sigstore exclusively by next year. If that happens, it will involve some changes in the way that Linux distributions verify Python releases, since none of the major distributions have processes for working with sigstore.

PGP and sigstore

No doubt many readers already have some experience with using implementations of OpenPGP, Pretty Good Privacy (PGP) or GNU Privacy Guard (GPG), to sign and verify artifacts. PGP signatures have been used for decades to provide proof that tarballs, packages, ISO images, and so forth are genuine. The terms PGP and GPG are used interchangeably in the various discussion threads and elsewhere, but we will stick with "PGP" for simplicity's sake.

To sign an artifact, a developer needs to create a PGP key pair and publish the public key for interested parties to use to verify the signature. Anyone can create a key pair that claims to be from "Foo Project Release Manager <rm@foo-project.org>", use it to sign artifacts, and publish the public key near the artifact. Users have had to rely on published keys to judge whether the key used belongs to the person (or organization) that is supposed to have signed the artifact. The ideal scenario is being able to validate a key fingerprint via the "web of trust", but that can be difficult.
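
In concrete terms, the signing and verification steps look roughly like this (a generic sketch with illustrative file and key names, not CPython's actual release process):

    # Release manager: generate a key pair (once), then sign the artifact.
    gpg --quick-generate-key "Foo Project Release Manager <rm@foo-project.org>"
    gpg --armor --detach-sign Python-3.13.0.tar.xz   # writes Python-3.13.0.tar.xz.asc

    # User: import the published public key, then verify the download.
    gpg --import foo-release-key.asc
    gpg --verify Python-3.13.0.tar.xz.asc Python-3.13.0.tar.xz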

In some ways, PGP signing is part of open-source community culture. It used to be common to have signing parties at events and Linux User Group (LUG) meetings to help expand the web of trust. People would examine each other's passports or other identifying documentation, collect public-key fingerprints, and then sign each other's keys to affirm that they belonged to the person in question. There are fewer key-signing parties these days, though they are still held at events like DebConf 2024.

The use of PGP has waned, but it is still actively used by upstream projects and Linux distributions. However, there are many complaints about the complexity of managing and verifying PGP keys, as well as about the deficiencies in the web-of-trust concept. The Linux kernel community had to implement its own solution to maintain the kernel's web of trust in 2019, following a series of attacks on public keyservers.

Many developers would like PGP to go away, but there have been few alternatives to replace it. Sigstore is now being used by some projects to do just that. It was started by Red Hat to solve the problem of providing signatures for container images, it moved to the Linux Foundation in 2021, and then moved again to the foundation's Open Source Security Foundation (OpenSSF). The project offers several tools and services for signing, verifying, and enforcing policy based on sigstore metadata.

With sigstore, a developer does not need to create or maintain a key pair (though it is possible to do so). Instead, they only need to have an account with a provider, such as GitHub, GitLab, Google, or Microsoft, that uses OpenID Connect (OIDC) to verify identity. Note that those are the default providers, but it is possible to set up other OIDC providers. The developer then uses a client, such as cosign, that requests a certificate from the OpenID provider to verify the developer's identity. Assuming the developer authenticates successfully with the OpenID provider, a short-lived certificate is issued for signing the artifact and then the attestation is published to a public ledger called Rekor. Cosign is also used to verify signatures. The sigstore documentation has a detailed overview of the process and the various sigstore components. An open-access research paper is available for those who would like to go deeper into the concept.
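
A rough sketch of that flow using the cosign client (assuming cosign 2.x-style options; file names and identities are illustrative):

    # Sign: cosign triggers an OIDC login, obtains a short-lived
    # certificate, signs the file, and records the signature in Rekor.
    cosign sign-blob --output-signature artifact.sig \
        --output-certificate artifact.pem artifact.tar.gz

    # Verify: check the signature against the expected signer identity
    # and OIDC provider, consulting the public transparency log.
    cosign verify-blob --signature artifact.sig --certificate artifact.pem \
        --certificate-identity rm@foo-project.org \
        --certificate-oidc-issuer https://accounts.google.com \
        artifact.tar.gz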

In short, sigstore removes the need for developers to have, maintain, and distribute PGP keys. It replaces the web of trust with centralized authorities for verifying the identity of artifact signers. Some of the downsides are that it relies on a handful of providers and requires tooling changes for those who have automated PGP signature checking.

Proposal

Larson started a "pre-PEP" discussion on the Python forum about dropping PGP signatures on September 25. Despite its newness, he made the case that sigstore had been adopted by other major projects and services, such as the Python Package Index (PyPI) (which dropped PGP support in 2023), npm, and GitHub, and was "likely to stick around". From a technical perspective, he said it made sense for those signing artifacts as well as those verifying them:

Sigstore has the benefit of not requiring release managers to maintain and protect separate long-lived expiring secret keys to maintain the integrity of Python artifacts, instead release managers only need to maintain control of an email account (something release managers already need to do). On the verifier side, Sigstore doesn't require discovering and fetching keys which might become unavailable, out of date, etc and doesn't require downloading an artifact to verify its integrity.

After some pre-PEP discussion, Larson published a draft of PEP 761 ("Deprecating PGP signatures for CPython artifacts") for discussion on October 9, and asked downstream users of the PGP signatures to weigh in.

If the PEP is accepted, CPython would phase out PGP signatures for the next major Python release, 3.14, due in October 2025. PGP signatures would still be published for updates to earlier CPython releases until those releases reach the end of their life. So, for example, the just-released Python 3.13 will continue to have PGP signatures for each update until it is discontinued in October 2029.

In the PEP, Larson argues that providing both PGP and sigstore signatures gives downstream projects no incentive to adopt sigstore: as long as CPython continues to publish PGP signatures, there is little motivation to switch.

Discussion

Larson's proposal drew support from Python contributors on the forum. Hugo van Kemenade, a CPython core developer, said that he would be the first release manager (RM) to be affected by the change and was strongly in favor of it. "Sigstore looks a much better solution, we already have it in place, and I'd much rather spend my RM time on things that are more useful for our users". He also noted that he had not needed to use PGP for two decades: "since Nokia (remember them?) used to mail us Symbian (remember that?) SDKs encrypted on DVDs (remember those?)".

Steve Dower, also a CPython core developer, weighed in that PGP signatures for Windows and macOS Python builds are "entirely redundant with the native OS embedded signatures we use", but that people complained when the project stopped providing PGP signatures for those builds: "the only reason we still do them is because people shouted last time we stopped". Once signatures were reinstated, he said, the shouting stopped.

Distributions weigh in

There has been no shouting in the discussions, but there have been concerns raised by maintainers from several Linux distributions. Miro Hrončok, Fedora's Python maintainer, said that Fedora verifies PGP signatures of Python tarballs when building packages, but said there were blockers to using sigstore to do the same. Specifically, he said that sigstore was missing offline verification and that the Python sigstore components had too many dependencies.

Larson pointed to the sigstore-go client as a potential solution to both problems. The discussion also spurred work on a longstanding issue of adding offline support to the sigstore-python tool. A release with offline support was uploaded to PyPI on October 10. Sigstore's cosign does support air-gapped or offline verification, though it requires downloading the trust root ahead of time to be able to use the --offline parameter.

Debian developer Stefano Rivera admitted that Debian had not been using PGP to verify CPython, because the package maintainers never got around to setting it up, or didn't consider it important. He said that he intended to ensure PGP signatures were checked in the future, and that a lot would need to happen for Debian to be able to support sigstore:

As to sigstore, we'd need to support it in our uscan tool, to have automated verification happen. To include the signature in our source packages, as PGP signatures are, we'd need to modify dpkg-source, and possibly the archive, to support storing them. It's hard to motivate that change for a single upstream. I know there are other ecosystems supporting it, but I've never run into any doing so actively.

Despite the extra work, though, Rivera said that he thought there was agreement on goals, and it would be a matter of "convincing people that sigstore is worth the effort of implementation in Debian package tooling." He started a discussion on debian-devel about supporting sigstore, as well as other mechanisms such as SSH signatures and signify.

Debian developer Guillem Jover provided some interesting thoughts on the matter. He was disappointed in the upstream discussion:

I find this usual conflation (in the upstream discussion) of GPG (or GnuPG) as if it was OpenPGP itself rather problematic, because the OpenPGP ecosystem is way richer than that, and as such way more active, and there has been and there is lots of work going on to implement and provide more usable, intuitive, secure and modern interfaces and code [...]

He was open to exploring other technologies for signing, however. He suggested adding "partial" support for other methods, such as updating the uscan script that downloads sources and verifies signatures, but not going so far as adding support to dpkg. Rivera agreed and said that Debian might "talk about adding support to dpkg later, when we can see significant use".

Sam James, a Gentoo developer, recommended that Python continue using PGP signatures for one more release cycle to allow those downstream of Python to adapt and provide feedback:

We didn't have sigstore packaged at all before now and had to package a lot of new dependencies for it which took time. Others will surely be in the same position. We also have a lot of tooling around PGP and none for sigstore yet.

Larson thanked James for providing feedback and said that the final timeline was up to the steering council. He asked a few follow-up questions about Gentoo's timeline for starting to package 3.14, what blockers exist beyond offline verification, and when Gentoo might know what those blockers would be.

On October 12, Gentoo developer Michał Górny committed a series of patches to verify sigstore signatures for Gentoo, answering the question of when Gentoo might be able to adapt to sigstore. Hrončok said that Fedora would need to package cosign before it would be able to use it for verifying artifacts, but didn't say whether that is going to happen or not.

OpenSUSE developer Matěj Cepl brought up the topic on openSUSE's factory mailing list, and asked for thoughts on using sigstore. So far there have only been a few responses, and none have been enthusiastic about sigstore.

Next steps

The discussion is still ongoing, so it is possible that the steering council will decide to defer dropping PGP signatures to give downstream projects more time. Packagers from the major distributions have been quick to jump into this discussion, but it is asking a lot to require distributions to adapt to this change in less than 12 months. One thing that is clear from the discussions is that a few years is not enough time for a technology like sigstore to gain widespread acceptance and understanding. At least, not without forcing the conversation. Larson seems to be doing a good job of getting that ball rolling, while working collaboratively to clear obstacles from his goal of sigstore adoption.

Whether Python drops PGP signatures in 3.14 or a later release, it seems likely that it will drop them at some point in the not-so-distant future, and that Linux distributions will need to add sigstore verification to their repertoire. That will likely help drive its adoption further, which means that it's time for users of open-source software and Linux distributions to start paying closer attention to the technology and its implications.

Comments (116 posted)

The long road to lazy preemption

By Jonathan Corbet
October 18, 2024
The kernel's CPU scheduler currently offers several preemption modes that implement a range of tradeoffs between system throughput and response time. Back in September 2023, a discussion on scheduling led to the concept of "lazy preemption", which could simplify scheduling in the kernel while providing better results. Things went quiet for a while, but lazy preemption has returned in the form of this patch series from Peter Zijlstra. While the concept appears to work well, there is still a fair amount of work to be done.

Some review

Current kernels have four different modes that regulate when one task can be preempted in favor of another. PREEMPT_NONE, the simplest mode, only allows preemption to happen when the running task has exhausted its time slice. PREEMPT_VOLUNTARY adds a large number of points within the kernel where preemption can happen if needed. PREEMPT_FULL allows preemption at almost any point except places in the kernel that prevent it, such as when a spinlock is held. Finally, PREEMPT_RT prioritizes preemption over most other things, even making most spinlock-holding code preemptible.

A higher level of preemption enables the system to respond more quickly to events; whether an event is the movement of a mouse or an "imminent meltdown" signal from a nuclear reactor, faster response tends to be more gratifying. But a higher level of preemption can hurt the overall throughput of the system; workloads with a lot of long-running, CPU-intensive tasks tend to benefit from being disturbed as little as possible. More frequent preemption can also lead to higher lock contention. That is why the different modes exist; the optimal preemption mode will vary for different workloads.

Most distributions ship kernels built with the PREEMPT_DYNAMIC pseudo-mode, which allows any of the first three modes to be selected at boot time, with PREEMPT_VOLUNTARY being the default. On systems with debugfs mounted, the current mode can be read from /sys/kernel/debug/sched/preempt.
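
With PREEMPT_DYNAMIC, the mode can also be changed at run time by writing to the same file; for example (the exact output format varies by kernel version):

    # cat /sys/kernel/debug/sched/preempt
    none (voluntary) full
    # echo full > /sys/kernel/debug/sched/preempt
    # cat /sys/kernel/debug/sched/preempt
    none voluntary (full)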

PREEMPT_NONE and PREEMPT_VOLUNTARY do not allow the arbitrary preemption of code running in the kernel; there are times when that can lead to excessive latency even in systems where minimal latency is not prioritized. This problem is the result of places in the kernel where a large amount of work can be done; if that work is allowed to run unchecked, it can disrupt the scheduling of the system as a whole. To get around this problem, long-running loops have been sprinkled with calls to cond_resched(), each of which is an additional voluntary preemption point that is active even in the PREEMPT_NONE mode. There are hundreds of these calls in the kernel.
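
The pattern looks something like the following kernel-style sketch, in which struct item and expensive_work() are stand-ins rather than real kernel symbols:

    #include <linux/sched.h>

    /*
     * Sketch: a long-running kernel loop with an explicit voluntary
     * preemption point.  Without the cond_resched() call, a PREEMPT_NONE
     * or PREEMPT_VOLUNTARY kernel could stay in this loop for a long
     * time, delaying higher-priority tasks.
     */
    static void process_many_items(struct item *items, unsigned long count)
    {
            unsigned long i;

            for (i = 0; i < count; i++) {
                    expensive_work(&items[i]);
                    cond_resched();   /* let a waiting task run if needed */
            }
    }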

There are some problems with this approach. cond_resched() is a form of heuristic that only works in the places where a developer has thought to put it. Some calls are surely unnecessary, while there will be other places in the kernel that could benefit from cond_resched() calls, but do not have them. The use of cond_resched(), at its core, takes a decision that should be confined to the scheduling code and spreads it throughout the kernel. It is, in short, a bit of a hack that mostly works, but which could be done better.

Doing better

The tracking of whether a given task can be preempted at any moment is a complicated affair that must take into account several variables; see this article and this article for details. One of those variables is a simple flag, TIF_NEED_RESCHED, that indicates the presence of a higher-priority task that is waiting for access to the CPU. Events such as waking a high-priority task can cause that flag to be set in whatever task is currently running. In the absence of this flag, there is no need for the kernel to consider preempting the current task.

There are various points where the kernel can notice that flag and cause the currently running task to be preempted. The scheduler's timer tick is one example; any time a task returns to user space from a system call is another. The completion of an interrupt handler is yet another, but that check, which can cause preemption to happen any time that interrupts are enabled, is only enabled in PREEMPT_FULL kernels. A call to cond_resched() will also check that flag and, if it is set, call into the scheduler to yield the CPU to the other task.

The lazy-preemption patches are simple at their core; they add another flag, TIF_NEED_RESCHED_LAZY, that indicates a need for rescheduling at some point, but not necessarily right away. In the lazy preemption mode (PREEMPT_LAZY), most events will set the new flag rather than TIF_NEED_RESCHED. At points like the return to user space from the kernel, either flag will lead to a call into the scheduler. At the voluntary preemption points and in the return-from-interrupt path, though, only TIF_NEED_RESCHED is checked.

The result of this change is that, in lazy-preemption mode, most events in the kernel will not cause the current task to be preempted. That task should be preempted eventually, though. To make that happen, the kernel's timer-tick handler will check whether TIF_NEED_RESCHED_LAZY is set; if so, TIF_NEED_RESCHED will also be set, possibly causing the running task to be preempted. Tasks will generally end up running for something close to their full time slice unless they give up the CPU voluntarily, which should lead to good throughput.
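
In rough terms (this is an illustrative sketch, not code from the patch series), the tick-time check amounts to:

    /*
     * Sketch only: promote a lazy reschedule request into an ordinary one
     * at timer-tick time.  The real series is structured differently; this
     * just illustrates the two-flag idea.
     */
    static void tick_promote_lazy_resched(struct task_struct *curr)
    {
            if (test_tsk_thread_flag(curr, TIF_NEED_RESCHED_LAZY))
                    set_tsk_need_resched(curr);   /* sets TIF_NEED_RESCHED */
    }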

With these changes, the lazy-preemption mode can, like PREEMPT_FULL, run with kernel preemption enabled at (almost) all times. Preemption can happen any time that the preemption counter says that it should. That allows long-running kernel code to be preempted whenever other conditions do not prevent it. It also allows preemption to happen quickly in those cases where it is truly needed. For example, should a realtime task become runnable as the result of handling an interrupt, the TIF_NEED_RESCHED flag will be set, leading to an almost immediate preemption. There will be no need to wait for the timer tick in such cases.

Preemption will not happen, though, if only TIF_NEED_RESCHED_LAZY is set, which will be the case much of the time. So a PREEMPT_LAZY kernel will be far less likely to preempt a running task than a PREEMPT_FULL kernel.

Removing cond_resched() — eventually

The end goal of this work is to have a scheduler with only two non-realtime modes: PREEMPT_LAZY and PREEMPT_FULL. The lazy mode will occupy a place between PREEMPT_NONE and PREEMPT_VOLUNTARY, replacing both of them. It will, however, not need the voluntary preemption points that were added for the two modes it replaces. Since preemption can now happen almost anywhere, there is no longer a need to enable it in specific spots.

For now, though, the cond_resched() calls remain; if nothing else, they are required for as long as the PREEMPT_NONE and PREEMPT_VOLUNTARY modes exist. Those calls also help to ensure that problems are not introduced while lazy preemption is being stabilized.

In the current patch set, cond_resched() only checks TIF_NEED_RESCHED, meaning that preemption will be deferred in many situations where it will happen immediately from cond_resched() in PREEMPT_VOLUNTARY or PREEMPT_NONE mode. Steve Rostedt questioned this change, asking whether cond_resched() should retain its older meaning, at least for the PREEMPT_VOLUNTARY case. Even though PREEMPT_VOLUNTARY is slated for eventual removal, he thought, keeping the older behavior could help to ease the transition.

Thomas Gleixner answered that only checking TIF_NEED_RESCHED is the correct choice, since it will help in the process of removing the cond_resched() calls entirely:

That forces us to look at all of them and figure out whether they need to be extended to include the lazy bit or not. Those which do not need it can be eliminated when LAZY is in effect because that will preempt on the next possible preemption point once the non-lazy bit is set in the tick.

He added that he expects "less than 5%" of the cond_resched() calls need to check TIF_NEED_RESCHED_LAZY and, thus, will need to remain even after the transition to PREEMPT_LAZY is complete.

Before then, though, there are hundreds of cond_resched() calls that need to be checked and, for most of them at least, removed. Many other details have to be dealt with as well; this patch set from Ankur Arora addresses a few of them. There is also, of course, the need for extensive performance testing; Mike Galbraith has made an early start on that work, showing that throughput with lazy preemption falls just short of that with PREEMPT_VOLUNTARY.

It all adds up to a lot to be done still, but the end result of the lazy-preemption work should be a kernel that is a bit smaller and simpler while delivering predictable latencies without the need to sprinkle scheduler-related calls throughout the code. That seems like a better solution, but getting there is going to take some time.

Comments (38 posted)

A report from the 2024 Image-Based Linux Summit

October 22, 2024

This article was contributed by Luca Boccassi

The Image-Based Linux Summit has by now established itself as a yearly event. Following on from last year's edition, the third edition was held in Berlin on September 24, the day before All Systems Go! 2024 (ASG). The purpose of this event is to gather stakeholders from various engineering groups and hold friendly but lively discussions around the topic of image-based Linux — that is, Linux distributions based around immutable images, instead of mutable root filesystems.

The format of the event consists of a series of BoF sessions held in sequence, on topics chosen by the attendees. Organizers Luca Boccassi and Lennart Poettering welcomed participants from the Linux Userspace API (UAPI) Group, who work for companies or on projects such as Microsoft, Canonical/Ubuntu Core, Debian, GNOME OS, Fedora, Red Hat, SUSE, Arch Linux, mkosi, Flatcar, NixOS, carbonOS, postmarketOS, Pengutronix, and Edgeless Systems.

Progress since the previous summit

The first order of business was letting participants summarize what they achieved on topics of interest since last year's summit. The UAPI Group's web site and GitHub organization have had more specifications added, including one that precisely defines the pattern to use for configuration file handling on a hermetic-usr system. That specification formalizes what projects such as systemd and libeconf already use.

Systemd

Perhaps unsurprisingly, the systemd project implemented a lot of new features in the two major releases that went out since the previous summit. One of the most important pieces of work was the implementation of the systemd-pcrlock tool, which aims to solve a major gap in the measured-boot story, namely how to deal with inherently local platform-configuration registers (PCRs) that are not under the control of the OS vendor. Poettering presented at ASG on this topic the following day (the video is available here). Once this tool is refined and ready for production use, it should push the measured-boot story on Linux much closer to completion. There were many other changes of course, and a "State of the Project" talk at ASG attempted to provide an overview.

Mkosi also saw several major updates, and can now run fully unprivileged. It has also dropped the use of bubblewrap to provide a mkosi-sandbox tool instead. Support for OpenSSL engines and providers was added to sign artifacts, and mkosi-initrd is now fully integrated to allow building initrds from packages for use in unified kernel images (UKIs). It can now produce artifacts that can be directly consumed by systemd-sysupdate. Support for new distributions, such as Azure Linux, was added.

Distributions

Distribution vendors and maintainers have been busy too. Flatcar has now adopted System Extensions (sysexts) as a way to extend its production deployments and to simplify user customisation. It also integrates systemd-sysupdate as a complementary service to let operators update custom extensions at their own pace. NixOS has fully integrated systemd-repart into its build system, including support for having a dm-verity signed /nix/store, and UKIs are available by default. Edgeless is trying to make progress on a proposal to write a shared specification for package-manager lockfiles that can be shared across multiple projects. The company is also still working on its Uplosi tool for uploading images to cloud providers.

OpenSUSE has implemented full disk encryption bound to the TPM using signed policies and pcrlock, added support for soft-reboot using the Btrfs-based transactional-updates mechanism, and provides systemd-boot as an option in the image installer. GNOME OS has made significant investments to improve and integrate systemd-homed and systemd-sysupdate, thanks to a grant from the Sovereign Tech Fund, and also started using sysexts for testing system components during development as part of its continuous-integration system. Red Hat made progress on the automotive use case by supporting dm-verity for the base image in the osbuild image-building tool. It is also working on the bootc project to make bootable container images.

Linux Plumbers Conference

The week before the summit, the 2024 Linux Plumbers Conference was held in Vienna, and many UAPI group members participated. They also organized one of the microconferences, the Kernel ↔ Userspace/Init/System Management boundaries and APIs MC. The experience was positive and the event was productive, with many topics of interest covered. One notable topic was a discussion about how to refactor the kernel's handling of initrd, with the end goal of being able to enforce an immutable, read-only initrd at run time, rather than the unpacked tmpfs that is currently used. This would avoid copying and the need to delete contents before the transition to the real root filesystem.

Kernel

A relevant update on the kernel side was the Integrity Policy Enforcement Linux Security Module (IPE LSM) being accepted for inclusion upstream during the 6.12 merge window. This new LSM lets image-based Linux deployments ship a code-integrity policy enforced by the kernel, so that only signed (and thus trusted) payloads can be executed at run time. Enabling this feature was always one of the goals of developing image-based Linux products, and a demo showing how this can work was given at ASG.

Dual-boot and the discoverable-partitions specification

After the updates were given, the participants discussed the compatibility of the discoverable-partitions specification (DPS) with dual booting and operating systems' ownership of their respective partitions. A new installer for GNOME OS has been introduced that no longer supports traditional /etc/fstab configurations. This change has raised questions about identifying which root and /usr partitions belong to which distribution.

To address this, the idea is that the root-filesystem discovery will be based on pattern matching on labels, though this feature has not yet been fully incorporated into the specifications. Corresponding /var partitions are identified through hashing of the machine ID, which is problematic when building images with mkosi, since the ID would need to be fixed at build time, the opposite of how it is supposed to be used. This limitation prompted questions about production adoption of DPS; for instance, SteamOS has not yet integrated it due to issues with discovering the complete partition set.

Proposals were made to enhance partition identification through label-based matching and filtering, ensuring backward compatibility with systems that do not use labels. The need to support multiple versions of the same OS (as opposed to just different OSes) was also noted, along with potential solutions for specifying root filesystems in configurations using UKIs and systemd credentials locked to the TPM.

Stateless OpenPGP verification

The next discussion focused on establishing a generic pattern to use Stateless OpenPGP for the verification of distribution artifacts, including repositories and packages. Participants identified numerous pitfalls associated with the current use of GnuPG, particularly its non-compliance with the latest standard and the statefulness of keyrings. APT, the package manager used by Debian, Ubuntu, and derivative distributions, currently supports a directory hierarchy under /etc/apt/trusted.gpg.d for OpenPGP keyring files. A similar but generalized scheme that could be adopted by various producers and consumers of keys would be ideal.

A proposal was made to explore additional technologies, such as PKCS #7, allowing for greater flexibility in how keys are managed and used across distributions. This would facilitate better integration with systemd for artifact authentication during downloads. The discussions emphasized the importance of establishing a clear specification for key management, ensuring that keys have designated purposes and are stored in a structured directory hierarchy, following the common /etc -> /run -> /usr pattern for discovery. The directory structure would indicate what the key is for (e.g. APT or systemd-sysupdate), but the policy defining how the key can be used (to sign packages, etc.) should be inside the key itself. The design of such a specification is currently in progress in the UAPI Group repository.
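
Purely as a hypothetical illustration of that pattern (the specification is still being drafted, so none of these paths are real), such a hierarchy might look like:

    /usr/share/keys/apt/debian-archive.openpgp        # vendor-shipped default
    /run/keys/apt/mirror-override.openpgp             # runtime override
    /etc/keys/sysupdate/local-admin.openpgp           # local administrator key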

Kernel-enforced restrictions for unsigned filesystems

The need for robust security features in systemd was underscored, particularly regarding access to unauthenticated filesystems. Proposals included the implementation of a BPF LSM program to reject access to unauthenticated filesystems (so only authenticated filesystems, such as those protected with dm-verity and dm-crypt, or kernel-provided virtual filesystems such as procfs and sysfs would be permitted) and deny access to device nodes outside of /dev. Additionally, the rejection of AF_UNIX sockets in unexpected locations like /etc and /usr was proposed.

The community was encouraged to submit requests for enhancements to track these proposed security policies. Enhancing the visibility of loaded programs in the BPF filesystem was also discussed, which would aid in managing filesystem policies more effectively. As an alternative, the recently merged IPE LSM could be enhanced to provide such controls, and, in fact, it already does provide such a feature at a proof-of-concept stage inside Azure Boost.

Combining FIDO2 and TPM2 for authentication

The relationship between FIDO2 and TPM2 technologies was a significant point of discussion. Participants explored the potential of combining TPM2 and FIDO2 as a two-factor authentication mechanism.

A TPM2 policy can enforce that a challenge-response type of authentication takes place before a secret can be unlocked. This could be used to send the challenge to a FIDO2 device, but it should also work with a PKCS#11 hardware security module. This is the scheme that ChromeOS already supports, so it appears to be a viable option.

A scheme based on Shamir's secret sharing was also discussed, and an implementation had even started to take form. The main downside compared to the previous option is that combining the key shards has to happen in main memory and be implemented by the CPU, while the other scheme lets the security chips handle this, which is safer.

Challenges of immutable systems and added complexity for contributors

How to deliver an immutable system without raising complexity for contributors was another point of discussion. The challenges of building images for postmarketOS, especially locally, were highlighted.

Plans are in motion for loading sysexts early in the boot process (while still in the initrd phase), so that they can be applied immediately on the rootfs. The idea of having a local writable layer that could be "committed" to a sysext was also floated. Another option could be to perform full image builds and sign them with local keys, with a fast reboot mechanism (provided by soft-reboot).

While this bypasses some security models, of course, it may serve as a way to let developers use the systems they are building while allowing for shorter development cycles, which are fundamental for productivity.

ChromeOS and NixOS both provide a "developer mode" where security requirements are relaxed, to allow for such workflows without impacting the security of production deployments. This is sometimes called a "break glass" mode. The GNOME OS developers were looking into providing such a feature, but there was interest in implementing this directly in systemd, instead, so that it can be integrated with the TPM.

Systemd on musl

The adaptation of systemd for use with musl libc garnered attention, particularly in the context of postmarketOS. The challenges faced by contributors were discussed, highlighting the need for collaboration to address the technical hurdles involved in porting systemd to this environment.

The current plan of record is for the postmarketOS developers to provide a shim library that implements the APIs missing from musl that are needed by systemd, such as pidfd_spawn(), the gshadow functions, and additional printf() formatter capabilities. These features are closely tied to the libc and should really be implemented by the libc authors.

Discussions also touched on the need for better management of /etc as a writable configuration context. Suggestions included persisting the machine ID and exploring solutions for managing presets, such as mounting /var from initrd.

The complexities of overlay filesystems and their interaction with writable configurations were explored, with participants suggesting that early mounting of /var during the boot process could mitigate some of these issues.

The /etc dilemma

As one of the most often-recurring topics in this area, it would have been strange if it hadn't been discussed. The question of how to handle /etc on immutable systems is one that has many possible answers, some more complete than others.

Even on a fully immutable system, some files, like the machine ID, are inherently local to a specific installation and cannot be part of the rootfs. These should remain stable across reboots, so they cannot be ephemeral either. There are ad-hoc solutions, like setting systemd.machine_id=firmware when booting a VM so that a machine ID can be generated from a VM UUID set by the hypervisor. Another proposed solution for physical machines could use the TPM or a sealed system credential to persist the machine ID, instantiated on first boot. But such an approach cannot scale, naturally.

The main issue is that any updated /etc files need to be visible from the very beginning of the rootfs boot phase, but the most common solutions mount data partitions such as /var as part of that same phase, so any files stored in /var and symbolically linked, bind-mounted, or otherwise made available to the rest of the system are not visible early enough. A proposed solution is to ensure that /var is already mounted by the early boot process, before switching root. In fact, SUSE MicroOS already prepares its /etc overlay in the initrd, so there is a working precedent for such a setup. It might be time for systemd to take care of this issue and generalize the move to mounting /var in the initrd, so that the various OSes and distributions can employ their preferred mechanism to update files in /etc, be that via confexts, snapshots, or overlays. Another workaround discussed involves using confexts in mutable mode, and either writing changes to /etc directly or redirecting writes to a staging directory from which a new confext is generated.
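
As a sketch of what that can look like with today's tooling, systemd's fstab-generator already understands an option that moves a mount into the initrd; an fstab line along these lines (the partition label is just an example) makes /var available before the switch to the rootfs:

    # /etc/fstab
    PARTLABEL=var  /var  ext4  defaults,x-initrd.mount  0  2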

Progress on hermetic /usr

Being closely related to the /etc dilemma, the efforts to push forward the hermetic /usr concept were also discussed at length. While significant progress has been made, challenges remain because a small number of projects have resisted the proposed changes. On a minimal base system, some configuration files that the GNU C library (glibc) uses (/etc/services, ldconfig, and nsswitch) are the last remaining items to address; the glibc maintainers are amenable to accepting patches if someone were to work on them, and there are plans to make this happen.

Outside of such a minimal setup, there are various strategies to deal with the lack of support for default configuration for other programs in /usr, which is less problematic as it tends to be a late-boot problem. A common solution is to ship /usr/share/factory/etc and create symbolic links via tmpfiles.d to link the configuration files into /etc. Another solution is to use overlayfs to layer configuration storage directories in /usr or /var on top of /etc, which is what SUSE MicroOS and Flatcar do.
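
A minimal tmpfiles.d sketch of that pattern (the file names are only illustrative) could look like the following; when the last field is omitted, the "L" and "C" lines link to, or copy from, the path with the same name under /usr/share/factory/:

    # /usr/lib/tmpfiles.d/etc-from-factory.conf
    L /etc/nsswitch.conf  -  -  -  -
    C /etc/services       -  -  -  -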

Unprivileged image mounting and user ranges assignment

Systemd recently introduced the mountfsd and nsresourced services that allow unprivileged users to mount verified images and to request user namespaces with pre-mapped UID/GID ranges. Previously, this had to be done via setuid tools like newuidmap, but that approach is known to be prone to security problems, since the caller controls the execution environment. Nsresourced is an interprocess-communication service (using varlink), so its execution environment is set up by systemd, like that of any other system service.

Work on these components is not done yet, though, and some challenges remain, such as how to assign ranges for different use cases, especially without knowing in advance what will be deployed. Dynamic assignment is problematic due to clashes, and manually configuring the assignments is cumbersome. The proposed solution is to assign a predefined range of UIDs/GIDs that all containers will use. Since the range is static and predefined, there is no need to know in advance how the system where the container will be deployed is configured, which greatly simplifies setups. All dynamic ranges will get mapped to this predefined range.

One of the remaining issues is that nesting is not possible, although it seems that work is planned to solve this problem in the kernel.

Another issue is that, given that users do not own the files on the filesystem using this static range, mountfsd will need to gain the ability to clean them up. This seems like a solvable problem with a new API designed for this purpose. Likewise, only images are handled now, and mountfsd should be enhanced to also be able to mount directories for users. Compared to the problem of getting buy-in from various projects to the idea of using the static, fixed range to build images, these technical challenges seem easy.

ESP resizing

UKIs require more storage space than the EFI system partition (ESP) was originally sized to provide, so the ESPs on many existing installations are not large enough, especially once addons and extensions are factored in. The boot loader specification introduced the extended boot loader partition for this reason, so that existing systems can gain additional storage space without having to reformat their drives.

But sometimes this is not enough either, and there is a strong desire to be able to dynamically extend the ESP. The problem is that there is no tool that can resize a VFAT filesystem in place, so this topic comes up often for discussion. Android, which was discussed next, ran into a similar issue with its OS partitions and solved it by concatenating partitions at the kernel level using dm-linear. An idea was proposed to implement something similar using a special GPT partition type and an algorithm for deriving partition UUIDs. But, so far, nobody has stepped up to implement this strategy, nor would it help with the ESP itself, so a solution to this problem remains elusive for now.

Factory reset

Factory reset is implemented in user space, with a special target that services can be hooked into and that can be booted to. Systemd-repart also has support for deleting data partitions and recreating them. But this is only part of the picture, as nowadays there will be data on the ESP too, in the form of credentials, addons, extensions, and self-signed images, so a strategy to deal with those is also needed.
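
On the systemd-repart side, partitions opt into that behavior in their repart.d definitions; a hypothetical drop-in for a data partition might look like this:

    # /usr/lib/repart.d/60-home.conf
    [Partition]
    Type=home
    Format=ext4
    # Remove and recreate this partition when a factory reset is requested
    FactoryReset=yes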

Managing the ESP is tricky as it could be shared among multiple OSes, and it might store vendor data that is necessary to boot the machine and should not be deleted. The agreed solution is to come up with a separate "vendor" directory for addons, extensions, and other artifacts that will never be removed on factory reset.

The TPM should also be reset, and fortunately an API already exists that can be called to queue such an operation for the next reboot. Integration in user space is required, but should be fairly straightforward.

And speaking of integration, a way to tie all of these mechanisms together is still needed. A proposal was made to allow users to request a factory reset directly from the boot menu. This reset process would trigger comprehensive system resets, including TPM resets and systemd-repart's factory-reset functionalities, and this should fill all of the gaps in the current implementation.

Customizing the boot process via credentials instead of the kernel command line

Projects implementing immutable systems largely rely on the boot loader to show options to users, to let them pick the desired snapshot, generation, or image to boot. The kernel command line is used as the medium to pass this information to the services in the initrd that set these systems up.

The problem is that the kernel command line is a kitchen sink; it is parsed by anything and everything, and used for diverse things, with no separation or namespacing. And, of course, it is also parsed and used by the kernel. An unprivileged user gaining access to the kernel command line could have catastrophic consequences for a system. The kernel even parses it before ExitBootServices has been called, so even the firmware is part of the attack surface.

The proposed solution is to switch to systemd credentials instead. These are scoped, individual, and targeted, so only the user-space service that needs a credential will receive it. And, of course, the kernel does not parse these credentials, so the attack surface is greatly diminished. There are two issues with this approach: first, the tooling is not up to scratch yet, and there is no GUI for selecting a credential or a subset of credentials to apply to a system when booting; second, user-space programs have largely not yet been enhanced to use credentials.

The first problem appears to be more difficult, as implementing a friendly GUI in the boot loader is no easy task, especially one that can display a large matrix of possible choices in a usable way. The second problem is technically simpler, since systemd makes it easy to opt in and consume a credential, but more work is needed to convince projects to adopt credentials. A fully implemented end-to-end story for credentials will probably be required before more projects take the plunge and adopt them as an alternative to the kernel command line for configuration.
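
As a rough sketch of how the pieces fit together today (the credential name and service are invented for the example), a hypervisor can hand a credential to a virtual machine as an SMBIOS string, and only the unit that explicitly requests it will see it:

    # On the host: pass a credential into the VM
    qemu-system-x86_64 ... \
        -smbios type=11,value=io.systemd.credential:myapp.mode=developer

    # In the guest: the consuming service opts in
    [Service]
    LoadCredential=myapp.mode
    ExecStart=/usr/bin/myapp --mode-file=${CREDENTIALS_DIRECTORY}/myapp.mode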

Conclusions

The day concluded as planned, with all participants agreeing it was productive and that work should continue on the UAPI Group and ancillary projects, and that the event should be repeated next year. The next immediate goal will be preparing for the Image-Based Linux devroom at FOSDEM 2025, hoping to repeat the success of the 2023 edition. The full minutes of the summit have been published on the UAPI Group web site. Pushing image-based Linux projects forward is not a single set of tasks but an ongoing process, one that requires participation and coordination from many projects, companies, and groups, and the Image-Based Linux Summit is the ideal forum for such activities.

Comments (none posted)

A look at the aerc mail client

By Joe Brockmeier
October 17, 2024

Email has become somewhat unfashionable as a collaboration tool for open-source projects, but there are still a number of projects—such as PostgreSQL and the Linux kernel—that expect contributors to send and review patches via email. The aerc mail client is aimed at developers looking for an efficient, extensible, text-based client designed for working with Git and email. It uses Vim-style keybindings by default, and has an interface inspired by tmux that lets users manage multiple accounts, mails, and embedded terminals at once.

Why is it called that?

By terminal-based email client standards, aerc is relatively new. The popular Mutt text-based email client was first released in 1995, and its fork NeoMutt was first released in 2016. The venerable Pine mail client was first released in 1992, and its rewrite Alpine appeared in 2007. At only six years old, aerc is a new kid on the block. Drew DeVault made the first commit to the aerc project in January 2018, and announced the 0.1.0 pre-release in June 2019. The project is written in Go, and made available under the MIT License.

The name was something of a mystery, so I emailed DeVault to ask where it came from. He replied that it originally stood for "asynchronous email reading client", to signify that the user interface and the network code to communicate with IMAP servers, etc., were decoupled to make the user experience better. That name, he said, was not very meaningful and "soon fell into the dustbin of history".

DeVault posted a notice to his aerc repository in January 2022 that development had moved to a fork maintained by Robin Jarry. That fork has seen active development ever since. Aerc 0.18.0, announced in July, is the most recent major release. That represented about five months' work, with 25 people contributing to the release. It included a lot of minor fixes and small enhancements, such as a new :bounce command for message resending, an :align command for positioning a message at the top, bottom, or center of the message list, and more. The most recent minor release (0.18.2) came out on July 29 with a small number of fixes.

Aerc is packaged for most, if not all, major Linux distributions and is also available in the Ports system for FreeBSD and OpenBSD as well as NetBSD's packages collection. I started out with the package in Fedora 40 but moved to compiling from source a few weeks later to get the most recent updates. So far, the main branch has been perfectly stable.

Retrieving and sending mail

Aerc currently supports reading email using IMAP, JMAP, notmuch, Maildir, and Maildir++ as backends. It works with SMTP and Sendmail for sending mail. Aerc provides a new-account command that will walk the user through a text-based configuration wizard to set up aerc to send and receive email.

The new-account utility only configures the bare minimum to retrieve and send emails. It is likely that users will want to edit the account configuration to add a signature, modify the interval for checking mail, use GnuPG for signing, and so forth. Account settings are stored in a plain-text file found in $HOME/.config/aerc/accounts.conf. See the accounts.conf man page for more information. All of aerc's configuration files are stored under $HOME/.config/aerc by default.

A basic configuration looks something like this:

    [Account Name]
    source = maildir://~/mymail
    outgoing = smtps://user:<password>@smtp.myhost.com
    default = INBOX
    from = Aerc User <user@myhost.com>
    copy-to = Sent
    signature-file = ~/.signature

Note that aerc will store IMAP, JMAP, and SMTP passwords in plain text in its configuration file during setup, which is not ideal from a security perspective. Users may want to use the source-cred-cmd directive in accounts.conf to run an external command, such as pass, to retrieve the password from an encrypted source.
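
For example, with the passwords removed from the source and outgoing URLs, directives along these lines (the pass entry names are made up for the example) will fetch the credentials on demand:

    source            = imaps://user@mail.myhost.com
    source-cred-cmd   = pass show mail/myhost/imap
    outgoing          = smtps://user@smtp.myhost.com
    outgoing-cred-cmd = pass show mail/myhost/smtp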

The aerc interface

Aerc is started by running aerc in a terminal. Unlike most text-based mail clients, aerc has a tabbed interface and allows users to have multiple accounts and emails open at the same time. It displays the account's folders on the left-hand side of the screen and emails on the right-hand side. If a user has more than one account configured, each one will be displayed in its own tab. Navigation is done using Vim-style keybindings: k and j to move up and down (respectively) in the message list, K and J to move up and down the folder list, and Enter to open a message or folder. Control-n and Control-p cycle between tabs.

As with Vim, the prefix to run commands is the colon key (:). New users may wish to try out the tutorial by running :help tutorial, which displays the tutorial man page with some basic instruction on its movement keys, the message viewer, composing messages, and the built-in terminal.

[The aerc mailer]

Typing : and hitting the Tab key displays a pop-up of commands that are available, which users can select by hitting Tab again until the desired command is highlighted, or typing the first few letters and hitting Tab to complete the command. For example, :compose will open the message composer, :reply -a -q will start a reply with the original email quoted in the default editor, and :terminal will open a new terminal within aerc.

Just as Vim has modes, such as normal mode for running commands, insert mode for editing text, and so forth, aerc has contexts. The available contexts for aerc include messages (viewing the list of messages in an account), view (actually viewing a single message), compose, and terminal. Users can use ? to display a list of key bindings that are available in whatever context they happen to be in.

Customizing aerc

As one might expect, aerc's keybindings are customizable and users can create keybindings to run command sequences of their own. Its keybinding configuration is found in $HOME/.config/aerc/binds.conf. The bindings are organized by context, and it is possible to use a binding to do different things depending on the context that aerc is in. The format for bindings is simple: the key sequence to be used, followed by the command to be run. As an example, this would set up a shortcut to move a message to the spam folder in a user's work account:

    [messages:account=Work]
    S = :read<Enter>:move Spam<Enter>

All of aerc's default keybindings are defined in this file as well—so users can reconfigure things to their heart's content. It should be possible, for example, to set up aerc to use Mutt shortcuts instead or even Emacs-style keybindings. (Though Emacs users would probably be loath to exit Emacs to use an external mail client.)

Users can hide mailbox folders and/or change the order in which they are displayed by adding folders and folders-sort directives to the accounts.conf file under the appropriate account. Users can also remap folder names to be more useful. For example, when using aerc with a Gmail account, users might want to remap the default IMAP names like "[Gmail]/Spam" to just "Spam". (Or "spam spam spam spam..." if they've watched entirely too much Monty Python's Flying Circus.) The following would tell aerc to only display the inbox, Sent, and Archive folders for an account, in that order, and to look in the file specified by folder-map to find the remapping of "[Gmail]/Sent" to Sent:

    folders = INBOX,Archive,Sent
    folders-sort = INBOX,Sent,Archive
    folder-map = /home/user/.config/aerc/folders

In the folders file, this stanza would remap the folder name:

    Sent = [Gmail]/Sent

Much of aerc's interface is customizable as well. For example, it's possible to rearrange the width and order of message columns (from, subject, date, etc.), the width of the folder sidebar, and much more. Basic configuration is available in the aerc.conf file (see the aerc-config manpage for more information), and users can style the user-interface colors and such using stylesets. The styleset feature is not merely for prettifying aerc's interface, though it's certainly suitable for that purpose. Stylesets can be used to highlight messages that match certain conditions, so that they're easier to spot in the message list. For instance, these stanzas will color messages that match "FR" (for review) in the subject, and messages sent to lists.debian.org or lists.postgresql.org:

    msglist_*.Subject,~FR.fg=#a64b2a
    msglist_*.To,~lists.debian.org.fg=#4e6a79
    msglist_*.To,~lists.postgresql.org.fg=#f0ece2

Running :reload after saving new rules will let them take effect without needing to restart aerc. The catppuccin/aerc repository has a few examples of stylesets that aerc users might want to use or borrow from.

Using aerc

Even though aerc is a terminal-based and keyboard-driven application, it has mouse support (assuming the terminal it is running in supports it). This can be enabled by setting mouse-enabled to true in aerc.conf. It allows selecting tabs, folders, and messages using the mouse, and scrolling through the message list with the mouse wheel. Beyond that, though, aerc has no menus to speak of—so the utility of the mouse is limited.
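
The relevant stanza in aerc.conf is short:

    [ui]
    mouse-enabled = true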

Reaching for the mouse is a productivity speed bump anyway. It's never really necessary to reach for the mouse while using aerc; everything a user would want to do is available via the keyboard. Almost everything, anyway. Emails and attachments can be piped to external commands using the :pipe and :open commands from within aerc. If an email contains a link the user would like to visit, the :open-link command will send it to the default web browser.
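
For example, a patch received by email can be applied to a Git repository by piping the full message to git am from the message list or viewer:

    :pipe -m git am -3

The -m flag passes the complete message, headers included, which is what git am expects.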

Aerc does not support using the mbox format as a backend, but it can import and export the format with the :import-mbox and :export-mbox commands. This might be handy for moving mail out of another program into aerc, or for importing archives from mailing lists.

Editing messages is handled by the external editor of the user's choice. By default, aerc will use whatever editor is specified by $EDITOR, but this can be modified in the aerc.conf file using the editor directive, like so:

    editor=nano

One of the primary use cases for aerc is collaborating with other people using Git via email. The expectation is that developers would send patches using Git's send-email command in a terminal running within aerc, and manage patches using the :patch command, which applies patches to a Git repository directly from within aerc. The sourcehut tutorial for using email and Git provides a good overview of using git send-email and the corresponding tutorial for reviewing contributions explains how to use aerc to review patches and provides a sample review project to work with.
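
A typical round trip might be to open an embedded terminal with :terminal and send the most recent commits from there (the list address is, of course, a placeholder):

    git send-email --to=project-devel@example.org --annotate HEAD~2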

Searching

Aerc has two commands that are useful for searching through one's email when using IMAP or a Maildir backend: :search and :filter, which have the same syntax but slightly different behaviors. Searching highlights matching messages and lets the user navigate to matching messages using n (forward) and N (backward). When filtering, only matching messages are displayed—which makes it handy for operating on messages in bulk. This command, for example, would show all messages in a folder from the last week that contain the term Firefox in the body:

    :filter -d this_week -b Firefox

Note that it's possible to run :filter more than once to narrow results, but a new :search replaces the old results. For example, one could run the above command to filter messages sent in the last week that contain the word "Firefox" in the body. Running another :filter command would further narrow the results rather than running a new search on the folder. Filtered results can be searched, too. Running :filter with no arguments will clear the filters and show all messages in the folder. Note that filtering is persistent until aerc is restarted or cleared—if a user filters messages in their Inbox folder and then switches to another folder and back, the filter in the Inbox would still be active. That may run counter to expectations and provoke a "where did all my mail go?" (or stronger) reaction if one is new to aerc.

If using the notmuch backend, users can work with the notmuch syntax for the :filter and :search commands. Aerc also has a :query command for creating a virtual folder from a notmuch query.
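
For instance, a query along these lines (the tags are whatever the user has defined in notmuch) would gather matching messages into a virtual folder:

    :query tag:patches and not tag:done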

The aerc project realizes that even developers have to take meetings on occasion. If a message has a meeting invite, users can use the :accept and :decline commands to generate a reply to the calendar invite that should let the sender update their calendar accordingly. Note that this does not update the aerc user's calendar, which will need to be done manually by piping the text/calendar part to a separate handler.
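
For those who do want to import the invitation, a minimal sketch, assuming a hypothetical script that reads an iCalendar file on standard input, would be to select the text/calendar part in the message viewer and pipe it:

    :pipe -p import-ics-to-calendar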

Overall, aerc is a highly customizable mail client with a small but active community behind it. I've found it very usable for managing multiple accounts with large amounts of email. There was a slight decline in productivity during the first week or two while tinkering with settings and keybindings, but that passes quickly. Performance-wise, aerc handles large amounts of mail stored locally very well. Sorting, searching, filtering, and operating on thousands of messages at once is almost instantaneous—though mileage may vary depending on the system resources available to aerc. It also performs well with an IMAP backend, though that is somewhat dependent on the service. Working with mail stored on LWN's server using IMAP has been notably faster than Fastmail, but I have more than a decade's worth of email stored on Fastmail's servers. It may be well past time to delete some of the newsletters and automated notices from previous decades from my archives.

Users who adopt aerc as their primary mail client may want to sign up for the aerc-discuss and aerc-devel mailing lists. Both lists are low-volume and have useful discussions about using and improving aerc.

Despite the low version number, aerc has proven stable and feature-complete enough for daily use. It will be interesting to see how the project evolves in the near future, and whether it ever reaches a 1.0 release.

Comments (5 posted)

Page editor: Jonathan Corbet


Copyright © 2024, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds