
Leading items

Welcome to the LWN.net Weekly Edition for November 3, 2022

This edition contains the following feature content:

  • Modernizing Fedora's C code: a proposal to move the distribution's packages away from pre-C99 constructs.
  • Moving past TCP in the data center, part 1: John Ousterhout's Netdev 0x16 keynote on TCP's shortcomings.
  • Copyright notices (or the lack thereof) in kernel code: a Btrfs disagreement highlights the lack of a kernel-wide policy.
  • Still waiting for stackable security modules: the long road toward stacking Linux security modules.

This week's edition also includes these inner pages:

  • Brief items: Brief news items from throughout the community.
  • Announcements: Newsletters, conferences, security updates, patches, and more.

Please enjoy this week's edition, and, as always, thank you for supporting LWN.net.

Comments (none posted)

Modernizing Fedora's C code

By Jake Edge
November 2, 2022

It is not often that you see a Fedora change proposal for a version of the distribution that will not be available for 18 months or so, but that is exactly what was recently posted to the mailing list. The change targets the C source code in the myriad of packages that the distribution ships; it would fix code that uses some ancient compatibility features that were removed by the C99 standard but are still supported by GCC. As might be guessed from the long runway proposed, there is quite a bit of work to do to get there.

As usual with Fedora change proposals, this one was posted to the Fedora devel mailing list on behalf of its owner, Florian Weimer, by Fedora program manager Ben Cotton; it is also available in an updated form on the Fedora wiki. At the moment, Fedora 37 is imminent, but the proposal targets Fedora 40, which is currently slated for the northern-hemisphere spring of 2024. The goal, as described by the title, is "Porting Fedora to Modern C".

Old C

There are several C features that were removed from the language in the C99 standard, but are still accepted by default in GCC, so use of them exists in the extensive code base that makes up Fedora. The idea would be to start working on cleaning those up, hopefully in collaboration with other distributions, and getting those changes into the upstream projects. There are six different constructs that are targeted, but the most important is removing implicit function declarations:

This legacy compatibility feature causes the compiler to automatically supply a declaration of type int function() if function is undeclared, but called as a function. However, the implicit return type of int is incompatible with pointer-returning functions, which can lead to difficult debugging sessions if the compiler warning is missed, as described here.

On 64-bit systems, implicitly returning a 32-bit int can result in a truncated pointer, as was described in the linked blog post. Similarly, functions returning _Bool that are called through an implicit int declaration may have their return values misinterpreted because they do not modify all 32 bits of the return register.
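The classic trigger is calling malloc() without including <stdlib.h>. A minimal sketch of the hazard (the wrapper function is illustrative, not from the proposal):

```c
#include <stdlib.h>
#include <string.h>

/* Allocate a copy of s. This works correctly only because <stdlib.h>
 * declares malloc() as returning void *. If the #include above were
 * missing, a pre-C99 compiler would implicitly declare "int malloc()",
 * and on a 64-bit system the returned pointer would be truncated to
 * 32 bits; -Werror=implicit-function-declaration turns that easily
 * missed warning into a hard error. */
char *dup_string(const char *s)
{
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);

    if (copy != NULL)
        memcpy(copy, s, len);
    return copy;
}
```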

Another problem identified is that, before C99, the following code would implicitly define two int variables since the type is not specified:

    static i = 1.0;
    auto j = 2.0;
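Under C99 rules those declarations need explicit types; a minimal conforming rewrite (the wrapper function is illustrative) could look like:

```c
/* C99-conforming fix for the implicit-int declarations above: each
 * variable gets an explicit type, so the 1.0 and 2.0 initializers are
 * no longer silently truncated to int. */
static double i = 1.0;

double get_sum(void)
{
    double j = 2.0;   /* was "auto j = 2.0;" - explicit type added */

    return i + j;
}
```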

Detecting and removing those two "implicit-int" features is complicated by the use of Autoconf in many of the packages that are part of Fedora:

Neither change is trivial to implement because introducing errors for these constructs (as required by C99) alters the result of autoconf configure checks. Quite a few such checks use an implicitly declared exit function, for instance. These failures are not really related to the feature under test. If the build system is well written, the build still succeeds, the relevant features are automatically disabled in the test suite and removed from reference ABI lists, and it's not immediately apparent that feature is gone. Therefore, some care is needed that no such alterations happen, and packages need to be ported to C99.

As noted in a message on the LLVM forum site, the Autoconf scripts that are included with a project may rely on the implicit declarations, and any compiler warnings or errors may get lost because of the way Autoconf is run. Tools are being developed to help find the problem code without upsetting the Autoconf apple cart. Collaboration with other distributions should also help in that process, the proposal said.

The other changes would fix some crufty constructs. The bool, true, and false keywords would be enforced; code that defines its own versions of those symbols would need to change. In addition, GCC does not treat the assignment of pointers to int variables as an error, which needs to be fixed. Old-style function definitions, such as:

    int sum(a, b)
        char *a;
        int b;
    { ...

would need cleaning up as well, the proposal said. And finally, the use of empty parentheses on function declarations needs to be tightened up:

In earlier C versions, a declaration int function() declares function as accepting an unspecified number of arguments of unknown type. This means that both function(1) and function("one") compile, even though they might not work correctly. In a future C standard, int function() will mean that function does not accept any parameters (like in C++, or as if written as int function(void) in current C). Calls that specify parameters will therefore result in compilation errors.
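A hypothetical modernization of the sum() example above combines both fixes; the second function and its body are purely illustrative:

```c
/* Modernized version of the old-style sum() definition: a full
 * prototype replaces the K&R parameter list, and (void) is used where
 * a function takes no arguments, so calls with the wrong argument
 * count or types become compile-time errors. The old forms are caught
 * by -Werror=old-style-definition and -Werror=strict-prototypes. */
int sum(char *a, int b);    /* prototype: types checked at every call */
int get_answer(void);       /* "(void)", not "()": takes no parameters */

int sum(char *a, int b)
{
    return (int)a[0] + b;
}

int get_answer(void)
{
    return sum("*", 0);     /* '*' is ASCII 42 */
}
```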

The benefits for Fedora are twofold, according to the proposal. Bugs that are sometimes hard to track down and "look eerily like compiler or ABI bugs" will be avoided because warnings that might be easily overlooked would become errors. In addition, it is believed that some of these legacy features are holding back progress in GCC. The proposal is looking forward to the release of GCC 14, which is expected to be included in Fedora 40. There is a belief that GCC 14 will disable support for those features by default, but even if that does not happen, Fedora 40 would change its defaults so that regressions are not introduced afterward.

Questions

Daniel P. Berrangé noted that Autoconf difficulties immediately sprang to mind when he saw the proposal. He pointed out that other build systems that probe for features by compiling up test programs of various sorts may also be affected. Weimer said that he had already found a problem of that sort in the Python setuptools. It generates test programs to determine whether certain functions are present in the environment, but those programs rely on implicit int, so they may fail with a strict compiler, but not because the function of interest is missing.

Berrangé had also asked about how developers could test their code to see if it has any of these problems; it is not simply setting -std=gnu99 since that will still allow the old constructs (for now at least). Weimer spelled out the list of flags needed ("It's -Werror=implicit-int -Werror=implicit-function-declaration. Best to throw in -Werror=int-conversion -Werror=strict-prototypes -Werror=old-style-definition."); he also updated the proposal on the wiki so that each of the flags was tied to the obsolete feature it would catch.

Alexander Sosedkin wondered why the porting effort couldn't be distributed and proceed at a more leisurely pace. To start, every package could be marked as not supporting the change and the defaults for GCC could be switched for all packages that do support it; nothing would change at that point, but packages could slowly make the changes needed and toggle their flag. Though Weimer wishes it could be done that way, he really does not think that a distributed porting effort is feasible; "we'd have to teach many more people about C arcana and autoconf corner cases. I don't think that's a good learning investment to be honest."

Vit Ondruch asked about the status of the effort and the number of packages that will be affected by it. Weimer said that he was still figuring that out, but his preliminary run found that about 10% of the Fedora packages were affected, which he believes is an overestimate. That certainly seems like a daunting number, though, since Fedora has more than 60,000 packages in its repository, the bulk of which are written in C.

There were no major complaints heard about the change in the thread. Gentoo is working on a similar set of changes for Clang 16, which already rejects the legacy C constructs; that should cut the work roughly in half, with luck. More distributions joining the party could reduce it further still. The Fedora Engineering Steering Committee (FESCo) will need to consider the proposal to decide whether the distribution should pursue it; as yet, the proposal is still in the discussion phase. It's a lot of work—thankless work, in truth—but would help modernize the code for lots of projects. Fedora would obviously benefit from that modernization as well.

Comments (26 posted)

Moving past TCP in the data center, part 1

By Jake Edge
November 1, 2022
Netdev

At the recently concluded Netdev 0x16 conference, which was held both in Lisbon, Portugal and virtually, Stanford professor John Ousterhout gave his personal views on where networking in data centers needs to be headed. To solve the problems that he sees, he suggested some "fairly significant changes" to those environments, including leaving behind the venerable—ubiquitous—TCP transport protocol. While LWN was unable to attend the conference itself, due to scheduling and time-zone conflicts, we were able to view the video of Ousterhout's keynote talk to bring you this report.

The problems

There has been amazing progress in hardware, he began. The link speeds are now over 100Gbps and rising with hardware round-trip times (RTTs) of five or ten microseconds, which may go lower over the next few years. But that raw network speed is not accessible by applications; in particular, the latency and throughput for small messages is not anywhere near what the hardware numbers would support. "We're one to two orders of magnitude off—or more." The problem is the overhead in the software network stacks.

If we are going to make those capabilities actually available to applications, he said, some "radical changes" will be required. There are three things that need to happen, but the biggest is replacing the TCP protocol; he is "not going to argue that's easy, but there really is no other way [...] if you want that hardware potential to get through". He said that he would spend the bulk of the talk on moving away from TCP, but that there is also a need for a lighter-weight remote-procedure-call (RPC) framework. Beyond that, he believes that it no longer makes sense to implement transport protocols in software, so those will need to eventually move into the network interface cards (NICs), which will require major changes to NIC architectures as well.

There are different goals that one might have for data-center networks, but he wanted to focus on high performance. When sending large objects, you want to get the full link speed, which is something he calls "data throughput". That "has always been TCP's sweet spot" and today's data-center networks do pretty well on that measure. However there are two other measures where TCP does not fare as well.

For short messages, there is a need for low latency; in particular, "low tail latency", so that 99% or 99.9% of the messages have low round-trip latency. In principle, we should be able to have that number be below 10µs, but we are around two orders of magnitude away from that, he said; TCP is up in the millisecond range for short-message latencies.

Another measure, which he has not heard being talked about much, is the message throughput for short messages. The hardware should be able to send 10-100 million short messages per second, which is "important for large-scale data-center applications that are working closely together for doing various kinds of group-communication operations". Today, in software, doing one million messages per second is just barely possible. "We're just way off." He reiterated that the goal would be to deliver that performance all the way up to the applications.

Those performance requirements then imply some other requirements, Ousterhout said. For one thing, load balancing across multiple cores is needed because a single core cannot keep up with speeds beyond 10Gbps. But load balancing is difficult to do well, so overloaded cores cause hot spots that hurt the throughput and tail latency. This problem is so severe that it is part of why he argues that the transport protocols need to move into the NICs.

Another implied requirement is for doing congestion control in the network; the buffers and queues in the network devices need to be managed correctly. Congestion in the core fabric is avoidable if you can do load balancing correctly, he argued, which is not happening today; "TCP cannot do load balancing correctly". Congestion at the edge (i.e. the downlink to the end host) is unavoidable because the downlink capacity can always be exceeded by multiple senders; if that is not managed well, though, latency increases because of buffering buildup.

TCP shortcomings

TCP is an amazing protocol that was designed 40 years ago when the internet looked rather different than it does today; it is surprising that it has lasted as long as it has with the changes in the network over that span. Even today, it works well for wide-area networks, but there were no data centers when TCP was designed, "so unsurprisingly it was not designed for data centers". Ousterhout said that he would argue that every major aspect of the TCP design is wrong for data centers. "I am not able to identify anything about TCP that is right for the data center."

He listed five major aspects of the TCP design (stream-oriented, connection-oriented, fair scheduling, sender-driven congestion control, and in-order packet delivery) that are wrong for data-center applications and said he would be discussing each individually in the talk; "we have to change all five of those". To do that, TCP must be removed, at least mostly, from the data center; it needs to be displaced, though not completely replaced, with something new. One candidate is the Homa transport protocol that he and others have been working on. Since switching away from TCP will be difficult, though, adding support for Homa or some other data-center-oriented transport under RPC frameworks would ease the transition by reducing the number of application changes required.

[Netdev virtual platform]

TCP is byte-stream-oriented, where each connection consists of a stream of bytes without any message boundaries, but applications actually care about messages. Receiving TCP data is normally done in fixed-size blocks that can contain multiple messages, part of a single message, or a mixture of those. So each application has to add its own message format on top of TCP and pay the price in time and complexity for reassembling the messages from the received blocks.
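As a sketch of that per-application burden, a minimal length-prefix framing layer (purely illustrative; not from the talk) might look like:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Encode one message with a 4-byte big-endian length prefix so the
 * receiver can find message boundaries in the raw byte stream. */
size_t frame_message(uint8_t *out, const uint8_t *msg, uint32_t len)
{
    out[0] = (uint8_t)(len >> 24);
    out[1] = (uint8_t)(len >> 16);
    out[2] = (uint8_t)(len >> 8);
    out[3] = (uint8_t)len;
    memcpy(out + 4, msg, len);
    return (size_t)len + 4;
}

/* Return the payload length of the next complete message in buf, or 0
 * if the receiver must wait for more bytes - the reassembly work (and
 * latency) that every TCP application ends up paying for itself. */
uint32_t next_message_len(const uint8_t *buf, size_t avail)
{
    uint32_t len;

    if (avail < 4)
        return 0;                       /* prefix not yet complete */
    len = ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
          ((uint32_t)buf[2] << 8) | buf[3];
    return (avail - 4 >= len) ? len : 0; /* 0: payload still partial */
}
```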

That is annoying, he said, but not a show-stopper; that you cannot do load balancing with TCP is the show-stopper. You cannot split the handling of the received byte-stream data to multiple threads because the threads may not receive a full message that can be dispatched and parts of a message may be shared with several threads. Trying to somehow reassemble the messages cooperatively between the threads would be fraught. If someday NICs start directly dispatching network data to user space, they will have the same problems with load balancing, he said.

There are two main ways to work around this TCP limitation: with a dispatcher thread that collects up full messages to send to workers or by statically allocating subsets of connections to worker threads. The dispatcher becomes a bottleneck and adds latency; that limits performance to around one million short messages per second, he said. But static load balancing is prone to performance problems because some workers are overloaded, while others are nearly idle.

Beyond that, due to head-of-line blocking, small messages can get trapped behind larger ones and need to wait for the messages ahead of them to be transmitted. The TCP streams also do not provide the reliability guarantees that applications are looking for. Applications want to have their message delivered, processed by the server, and a response returned; if any of those fail, they want some kind of error indication. Streams only deliver part of that guarantee and many of the failures that can occur in one of those round-trip transactions are not flagged to the application. That means applications need to add some kind of timeout mechanism of their own even though TCP has timeouts of various sorts.

The second aspect that is problematic is that TCP is connection-oriented. It is something of an "article of faith in the networking world" that you need to have connections for "interesting properties like flow control and congestion control and recovery from lost packets and so on". But connections require the storage of state, which can be rather expensive; it takes around 2000 bytes per connection on Linux, not including the packet buffers. Data-center applications can have thousands of open connections, however, and server applications can have tens of thousands, so that storing the state adds a lot of memory overhead. Attempts to pool connections to reduce that end up adding complexity—and latency, as with the dispatcher/worker workaround for TCP load balancing.

In addition, a round-trip is needed before any data is sent. Traditionally, that has not been a big problem because the connections were long-lived and the setup cost could be amortized, but in today's microservices and serverless worlds, applications may run for less than a second—or even just for a few tens of milliseconds. It turns out that the features that were thought to require connections, congestion control and so on, can be achieved without them, Ousterhout said.

TCP uses fair scheduling to share the available bandwidth among all of the active connections when there is contention. But that means that all of the connections finish slowly; "it's well-known that fair scheduling is a bad algorithm in terms of minimizing response time". Since there is no benefit to handling most (but not all) of a flow, it makes sense to take a run-to-completion approach; pick some flow and handle all of its data. But that requires knowing the size of the messages, so that the system knows how much to send or receive, which TCP does not have available; thus, fair scheduling is the best that TCP can do. He presented some benchmarking that he had done that showed TCP is not even actually fair, though; when short messages compete with long messages on a loaded network, the short messages show much more slowdown ("the short messages really get screwed").

The fourth aspect of TCP that he wanted to highlight is its sender-driven congestion control. Senders are responsible for reducing their transmission rates when there is congestion, but they have no direct way to know when they need to do so. Senders are trying to avoid filling up intermediate buffers, so the congestion signals are based on how full the buffers are in TCP.

In the extreme case, queues overflow and packets get dropped, which causes the packets to time out; that is "catastrophic enough" that it is avoided as much as possible. Instead, various queue-length indications are used as congestion notifications that the sender uses to scale back its transmission. But that means there is no way to know about congestion without having some amount of buffer buildup—which leads to delays. Since all TCP messages share the same class of service, all messages of all sizes queue up in the same queues; once again, short-message latency suffers.

The fifth aspect of the TCP design that works poorly for data centers is that it expects packets to be delivered in the same order they were sent in, he said; if packets arrive out of order, it is seen as indicating a dropped packet. That makes load balancing difficult both for hardware and software. In the hardware, the same path through the routing fabric must be used for every packet in a flow so that there is no risk of reordering packets, but the paths are chosen independently by the flows and if two flows end up using the same link, neither can use the full bandwidth. This can happen even if the overall load on the network fabric is low; if the hash function used to choose a path just happens to cause a collision, congestion will occur.

He hypothesizes that the dominant cause of congestion in today's data-center networks is this flow-consistent routing required by TCP. He has not seen any measurements of that, but would be interested; he invited attendees who had access to data-center networks to investigate it.

Processing the packets in software also suffers from this load-balancing problem. In Linux, normally a packet will traverse three CPU cores, one where the driver code is running, another where the network-stack processing is done (in a software interrupt), and a third for the application. In order to prevent out-of-order packets, the same cores need to be used for all of the packets in a flow. Like with the hardware, though, if two flows end up sharing a single core, that core becomes a bottleneck. That leads to uneven loading in the system; he has measured that it is the dominant cause of software-induced tail latency for TCP. That is also true for Homa on Linux, he said.

There is a question of whether TCP can be repaired, but Ousterhout does not think it is possible. There are too many fundamental problems that are interrelated to make that feasible. In fact, he can find no part of TCP that is worth keeping for data centers; if there are useful pieces, he would like to hear about them. So, in order to get around the "software tax" and allow applications to use the full potential of the available networking hardware, a new protocol that is different from TCP in every aspect will be needed.

That ended the first half of Ousterhout's keynote; next up is more on the Homa transport protocol that has been developed at Stanford. It has a clean-slate protocol design specifically targeting the needs of data centers. Tune in for our report on that part of the talk in a concluding article that is coming soon.

Comments (63 posted)

Copyright notices (or the lack thereof) in kernel code

By Jonathan Corbet
October 27, 2022
The practice of requiring copyright assignments for contributions to free-software projects has been in decline for years; the GNU Binutils project may be the latest domino to fall in that regard. The Linux kernel project, unlike some others, has always allowed contributors to retain their copyrights, resulting in a code base that has widely distributed ownership. In such a project, who owns the copyright to a given piece of code is not always obvious. Some developers (or their employers) are insistent about the placement of copyright notices in the code to document their ownership of parts of the kernel. A series of recent discussions within the Btrfs subsystem, though, has made it clear that there is no project-wide policy on when these notices are warranted — or even acceptable.

In early September, a patch series implementing fscrypt integration for the Btrfs filesystem included this patch adding, among other things, a one-line Facebook copyright notice. Btrfs maintainer David Sterba replied with a request to limit copyright information to SPDX tags; he cited a page in the Btrfs wiki, asserting that these tags are a complete replacement for copyright notices. Christoph Hellwig disagreed, pointing out that SPDX describes licensing but not ownership:

It is not a replacement for the copyright notice in any way, and having been involved with Copyright enforcement I can tell you that at least in some jurisdictions Copyright notices absolutely do matter.

Hellwig, of course, was the initiator of a GPL-infringement lawsuit against VMware that was dismissed due to an inability to prove ownership of the code in question. It is thus unsurprising that he is sensitive to the placement of copyright notices in the code itself. When Hellwig submitted a patch of his own, also in September, that added a copyright notice to a newly created file, Sterba let it be known that he would refuse that change as well. Toward the end of October, in the discussion of yet another patch set, Hellwig eventually withdrew the work, saying:

FYI, I object to merging any of my code into btrfs without a proper copyright notice, and I also need to find some time to remove my previous significant changes given that the btrfs maintainer refuses to take the proper and legally required copyright notice.

Given that the kernel code has no shortage of copyright notices (nearly 79,000 lines contain the word "copyright"), it is natural to wonder why this policy is being applied in the Btrfs subsystem. The Btrfs wiki page describes the reasoning:

The copyright notices are not required and are discouraged for reasons that are practical rather than legal. The files do not track all individual contributors nor companies (this can be found in git), so the inaccurate and incomplete information gives a very skewed if not completely wrong idea about the copyright holders of changes in a given file. The code is usually heavily changed over time in smaller portions, slowly morphing into something that does not resemble the original code anymore though it shares a lot of the core ideas and implemented logic. A copyright notice by a company that does not exist anymore from 10 years ago is a clear example of uselessness for the developers.

The page also states that the Signed-off-by tags found in the kernel's Git history are sufficient to document the copyright status of the code. There are a few difficulties with this position, including the fact that those tags indicate that the submitter has the right to contribute the code to the kernel, but do not necessarily show who the copyright owner is. Another problem was pointed out by Bradley Kuhn: if the Git history serves as the copyright notices for the code, then it will be necessary to ship the entire Git repository to be in compliance with the GPL's source-code requirements. That undercuts the argument that copyright notices in the code are too unwieldy to maintain.

In the most recent discussion, Chris Mason said that "Christoph's request is well within the norms for the kernel". Sterba replied that he would consider changing the policy, but only as part of a wider policy decision by the kernel project:

I've asked for recommendations or best practice similar to the SPDX process. Something that TAB can acknowledge and that is perhaps also consulted with lawyers. And understood within the linux project, not just that some dudes have an argument because it's all clear as mud and people are used to do things differently.

It's not clear who Sterba has asked for recommendations at this point. Chances are that he will find, over time, that the Btrfs subsystem's position on copyright notices is not widely held across the project as a whole. Steve Rostedt arguably described the consensus view: "The policy is simple. If someone requires a copyright notice for their code, you simply add it, or do not take their code". In the absence of a decree from Linus Torvalds, though, the issue of copyright notices may continue to be a source of disagreement. Claiming copyright on a portion of a shared body of code can always be a touchy matter, but it's one that developers can care a lot about.

Comments (37 posted)

Still waiting for stackable security modules

By Jonathan Corbet
October 31, 2022
The Linux security module (LSM) mechanism was created as a result of the first Kernel Summit in 2001; it was designed to allow the development of multiple approaches to Linux security. That goal has been met; there are several security modules available with current kernels. The LSM subsystem was not designed, though, to allow multiple security modules to work together on the same system. Developers have been working to rectify that problem almost since the LSM subsystem was merged, but with limited success; some small security modules can be stacked on top of the "major" ones, but arbitrary stacking is not possible. Now, a full 20 years after security-module support went into the 2.5 development kernel series, it looks like a solution to the stacking problem may finally be getting closer.

The challenge

The early thinking was that an LSM would enforce a security policy on the entire system, and that there would be only one of them. The fact that the only existing LSM for several years was SELinux helped to reinforce that belief, but developers quickly realized that there could be good reasons to run multiple LSMs on a system. A proper stacking scheme would, for example, make it possible to use a variety of small LSMs, each of which is aimed at a piece of the security policy. More recent developments, such as containers, have increased the number of settings where even having multiple full-system modules loaded might make sense.

There has been no shortage of attempts to solve this problem over the years; several of them have been covered here.

Anybody who wants to solve this particular problem is going to have to face a number of challenges. One of those is deciding whether to allow an operation if there are multiple active LSMs and they disagree with each other. The simplest approach there is to give any LSM veto power; all modules that express an opinion on any specific operation must agree to allow it, or it will be denied. The hardest problems may well be elsewhere. Figuring out what the user-space interfaces should look like when multiple LSMs are active is not straightforward; tracking down policy problems can be painful even when there is only one module in the mix.
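The veto-style composition can be sketched as follows (hypothetical types and return codes; this mirrors the "any LSM can veto" policy described above, not the kernel's actual hook machinery):

```c
#include <stddef.h>

typedef int (*lsm_hook_fn)(int op);   /* 0 = allow, negative = deny */

/* Sample modules: one with no restrictions, one that denies op 1. */
static int allow_all(int op)   { (void)op; return 0; }
static int deny_writes(int op) { return op == 1 ? -13 : 0; /* -EACCES */ }

/* Consult every stacked module that implements this hook; the first
 * denial short-circuits the walk, so an operation is allowed only if
 * every module with an opinion agrees. */
static int compose_hooks(const lsm_hook_fn *hooks, size_t n, int op)
{
    for (size_t i = 0; i < n; i++) {
        if (hooks[i] == NULL)
            continue;                 /* module has no opinion here */
        int rc = hooks[i](op);
        if (rc != 0)
            return rc;                /* first denial wins */
    }
    return 0;                         /* all modules allowed it */
}
```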

Another significant problem is giving LSMs the means to attach their own metadata to objects in the system. The original LSM patches handled this by adding pointers to various kernel data structures, but no provision was made for the problem of multiple modules needing to store data. Any solution has to allow LSMs to cooperate in this regard as well while, at the same time, not having a measurable effect on performance.

A viable solution?

Casey Schaufler does not lack for persistence; ten years after starting on this project, he is still trying to get a solution for security-module stacking that addresses these problems into the mainline kernel. Version 38 of his stacking patch set was posted in late September; it does not solve the entire problem, but it does make it possible to stack the AppArmor LSM with any other module. After all those years and versions, it might not be surprising to learn that Schaufler is ready to see this work merged; back in August, he asked whether that could happen during the 6.1 kernel cycle:

I would like very much to get v38 or v39 of the LSM stacking for Apparmor patch set in the LSM next branch for 6.1. The audit changes have polished up nicely and I believe that all comments on the integrity code have been addressed. The interface_lsm mechanism has been beaten to a frothy peak.

This plan was complicated by an independent event, though: longtime LSM maintainer James Morris stepped aside and Paul Moore took over the maintainership of that subsystem. This change arguably had both positive and negative effects with regard to the stacking patches. On the positive side, Moore appears to have more time to engage with the stacking patch set and a stronger desire to see it merged into the mainline. Less positive, at least with regard to a quick merging of the patches, is that Moore felt the need to re-review the patch set from the beginning, which inevitably led to comments and requests for changes.

Specifically, Moore was unhappy with the user-space API, which is an extension of the existing, /proc-based interface that even Schaufler described as "hideous". Moore suggested that perhaps the time had come to add a set of LSM-specific system calls instead:

We have avoided this in the past for several reasons, but over the past couple of decades the LSMs have established themselves as a core part of Linux with many (all?) major Linux distributions shipping and supporting at least one LSM; I think we can justify a handful of well designed syscalls, and with Landlock we have some precedence too.

Moore laid out a rough design for the system-call API that he had in mind as well. Schaufler was less than pleased with this idea, though:

I wish you'd suggested this three years ago, when I could have done something with it. If stacking has to go on a two year redesign because of this it is dead. We've spent years polishing the /proc interfaces. Changed the names, the content, even bent over backwards to accommodate the security module that refused to adopt an attr/subdir strategy.

User-space interfaces can be exceedingly difficult to change once they have been included in a kernel release; if significant changes are required, they usually need to happen before the code is merged. So it is not entirely surprising that Moore was insistent, saying that he could not accept the proposed interface; Schaufler eventually threw in the towel and started discussing what he needed to do:

OK, so what interfaces need to be redone? I have been polishing what's just become a turd for a %^&*(ing long time. I need to know whether it is something I can address, or whether I just toss the entire thing in the proverbial bit bucket.

The system-call API

Schaufler eventually came back with a proposal for two new system calls. The first of those is:

    struct lsm_ctx {
	unsigned int		id;
	unsigned int		flags;
	__kernel_size_t		ctx_len;
	unsigned char		ctx[];
    };

    int lsm_self_attr(struct lsm_ctx *context, size_t *size, int flags);

Here, context is a buffer that is *size bytes in length; the flags argument must be zero. This call will return all of the attributes assigned to the calling process by the security module(s) currently in force, in the buffer pointed to by context; this patch describes the format of the returned data. The size parameter will be updated with the actual size of the returned data. The second system call can be used to determine which LSMs are currently active:

    int lsm_module_list(unsigned int *ids, size_t *size, unsigned int flags);

This call will fill the ids array with the ID numbers assigned to each of the active modules. These ID numbers are defined in a new header file that is intended to be a part of the user-space API; Schaufler's Smack module, for example, is defined as:

    #define LSM_ID_SMACK	34

Much of this design follows Moore's initial suggestions. It appears to be mostly uncontroversial — with one significant exception. Tetsuo Handa, a developer of the Tomoyo LSM, has vociferously and repeatedly objected to the use of integer module IDs assigned within the kernel code itself. This practice will, he has argued, make it impossible to use run-time loadable LSMs that are not currently part of the kernel source. As a result, it will be hard for developers of LSMs to test them or (especially) get others to work with them. That, in turn, spells a "death sentence" for any new LSMs in the future, he said.

As others have pointed out, there are a few problems with this argument, starting with the fact that the kernel-development community has never gone out of its way to make life easier for out-of-tree code. Another is that LSMs, whether in-tree or not, cannot be loaded at run time now. That capability was removed many years ago and seems unlikely to return; among other things, it is too easy for LSMs to bypass the restrictions normally applied to kernel modules. For this reason, Handa's request to simply export the security_hook_heads variable to kernel modules is unlikely to be viewed favorably. Schaufler has also said repeatedly that any new mechanism for loadable LSMs would have to treat those modules quite differently than built-in LSMs, since loadable LSMs would have to be more severely restricted. That is another big job that he personally has no intention of taking on.

For all of these reasons, Handa's objections seem unlikely to prevail in the end. But this work, which has had such a turbulent history for so long, may still not be merged immediately. New system calls require extensive review, and that process has just begun; it wouldn't be surprising if more changes were called for. Even so, the end of the process for limited LSM stacking may be getting closer. Then all that is left is "universal stacking", a prospect that, according to Schaufler, is "at least a year off". There is visible progress, but this lengthy discussion is not yet finished.

Comments (14 posted)

Packaging Rust for Fedora

By Jonathan Corbet
October 28, 2022
Linux distributions were, as a general rule, designed during an era when most software of interest was written in C; as a result, distributions are naturally able to efficiently package C applications and the libraries they depend on. Modern languages, though, tend to be built around their own package-management systems that are designed with different goals in mind. The result is that, for years, distributors have struggled to find the best ways to package and ship applications written in those languages. A recent discussion in the Fedora community on the packaging of Rust applications shows that the problems have not yet all been solved.

The initial spark for the discussion was this Fedora 38 change proposal driven by Panu Matilainen. The RPM package manager has long carried its own internal OpenPGP parser for the management of keys and signatures for packages. This parser seemingly pleases nobody; the proposal describes it as "rather infamous for its limitations and flaws" and puts forward a plan to replace it with the Sequoia library, which is written in Rust (and which was covered here in 2020). The use of Rust provides the sort of safety net that is welcome in security-relevant code like this, but it can also be a red flag for developers who worry about how Rust fits into the distribution as a whole.

Inevitably, there were complaints about this proposal. Kevin Kofler, for example, asked why a library written in C had not been chosen. According to Matilainen, efforts to find such a library have been underway for years without success. The most obvious alternative, GPGME, is unsuitable because it is built around communicating with an external GPG process, "which is a setup you do NOT want in the rpm context where chroots come and go etc.". Neal Gompa agreed that the GPGME model creates pain in this context, and seemed to agree that there was no better alternative than Sequoia despite his own disagreements with the Rust community. "So here we are, in a subpar situation created by bad tools because nobody cares enough about security anyway".

Kofler went on to outline his problems with the Rust language. One of those was simply that it's yet another language to deal with, a complaint that didn't draw a lot of sympathy on the list. His other objection, though, struck closer to home:

The worst issue I see with Rust is the way libraries are "packaged", which just implies installing source code and recompiling that source code for every single application. (And as a result, the output obviously gets statically linked into the application, with all the drawbacks of static linking.) I consider a language with no usable shared library support to be entirely unpackageable and hence entirely useless.

Fabio Valentini, who works on packaging Rust crates for Fedora, pointed out that Sequoia is implemented as a shared library with a C ABI, so there will be no need to statically link any Rust code into RPM. He asked Kofler for any constructive suggestions he might have for improving the situation; that request was not addressed in Kofler's response. Fedora project leader Matthew Miller did have some thoughts, though.

Specifically, he agreed with Kofler that Rust applications may, in the end, just be "unpackagable". He mentioned his efforts with the Bevy game engine; he found that invoking the Cargo build system to obtain Bevy's build dependencies will fetch no less than 390 separate crates, about half of which are not currently packaged for Fedora. Trying to package such an application is sure to be painful but, he said, "this is what open source winning looks like". Cargo makes it easy to share and reuse software components, which is a great benefit, but it makes packaging all of those dependencies independently much harder. The fact that many of those dependencies are on specific versions of the crates involved makes the task harder yet.

All of this has led him to question the value of the work that is going into packaging Rust crates for Fedora. Instead, he said, Rust could be an opportunity to explore different approaches. "Something lightweight where we cache crates and use them _directly_ in the build process for _application_ RPMs". The implication was clearly that, by not trying to package all of the dependencies or ship dynamically linked executables, Fedora could work more directly with the Cargo ecosystem, save a lot of work that is (to him) of dubious value, and more easily get applications out to users. Fedora, he concluded, needs to adapt to remain relevant in the current development environment.

Few readers are likely to be surprised by the news that Valentini disagreed with this point of view. Bevy, he said, is a bit of a special case; most Rust applications are relatively easy to package for Fedora because the most popular crates are already packaged. Cargo and RPM, he added, work in similar ways, making the packaging job easier; in many cases, the RPM spec file can be generated automatically from the Cargo metadata. Meanwhile, the packaging effort brings all of the usual benefits, including cross-architecture testing, code and licensing review, and upstream contributions to make packaging easier in the future.
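As a rough illustration of that generation step, here is a heavily trimmed sketch of what a rust2rpm-style spec file looks like. The crate name and versions are placeholders, and real output contains considerably more (subpackages for crate features, license tags pulled from the Cargo metadata, and so on); the %cargo_* macros come from Fedora's Rust packaging tooling.

```spec
%global crate example

Name:           rust-%{crate}
Version:        0.1.0
Release:        1%{?dist}
Summary:        Example crate
License:        MIT
URL:            https://crates.io/crates/%{crate}
Source:         %{crates_source}
BuildRequires:  rust-packaging

%prep
%autosetup -n %{crate}-%{version} -p1
%cargo_prep

%generate_buildrequires
# Turn the dependencies in Cargo.toml into BuildRequires on the
# corresponding packaged crates.
%cargo_generate_buildrequires

%build
%cargo_build

%install
%cargo_install

%check
%cargo_test
```

Because the spec is derived mechanically from Cargo.toml, updating a packaged crate is often a matter of re-running the generator against the new upstream release.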

Trying to change the packaging process for Rust applications will, he said, make things worse instead. That is what happened with both Node.js and Java, he said (some of the Java discussions were covered here in June). Overall, he concluded, the situation with Rust is relatively good, and trying to do something other than "plain RPM packages" is likely to create more problems than it solves.

Kofler, instead, decried the ease with which Rust allows the addition of dependencies, calling the result "dependency hell". Rather than Fedora adapting to Rust, he said, Rust is going to have to adapt to become more relevant to Linux distributions. Gompa was not optimistic about that happening, though, saying that his efforts in that direction had met significant resistance in the past.

The conversation wound down at that point without any definitive conclusions. There is one relevant point that wasn't addressed that is worth considering, and which is highlighted by the use of Sequoia in RPM. Language-specific environments can work nicely as long as the developer sticks with the language in question; they can fall down when faced with the need to combine code written in multiple languages. At that point, the distribution model, which tries to make all packages work well together, shows its value. Given that the Rewrite The World In Rust Project is destined to take years to reach its conclusion, it seems likely that the number of mixed-language applications will increase for some time, and distributors will need to be able to package and ship those applications.

For the time being, the packaging of Rust crates for Fedora seems likely to continue without significant changes. But the topic of the intersection between distribution and language-specific package managers seems destined to reappear regularly for the indefinite future. Finding a way to make these independent ecosystems interact more smoothly will not be easy, but it would be beneficial to all involved; it is a problem worth working on.

Comments (185 posted)

Page editor: Jonathan Corbet


Copyright © 2022, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds