Kernel development

Brief items

Kernel release status

The current development kernel remains 3.6-rc3; no new -rc releases have been made in the past week.

Stable updates: the 3.0.42, 3.4.10, and 3.5.3 stable updates were released on August 26 with the usual pile of important fixes. There are some reports of Intel graphics problems with 3.0.42 and 3.4.10, so users may want to proceed carefully with those.


Quotes of the week

Somebody is trying to kill all the kernel developers.

First we had two earthquakes - fine, this week God not only hates republicans, but apparently us kernel developers too. But we kernel developers laugh in the face of danger, and a 5.5 earthquake just makes us whimper and hide in the closet for a while.

But after we've stopped cowering in the closet, there's a knock on the door, and the conference organizers are handing out skate boards, with the innocent explanations of "We're in San Diego, after all".

If that's not a sign of somebody trying to kill us, I don't know what is.

— Linus Torvalds

Regarding Linux, obviously Linus Torvalds isn't enforcing his own copyrights through [the Software Freedom] Conservancy, but I recently asked Linus at a party: "Are you mad at me for doing GPL enforcement for Linux?" Linus answered: "No. In fact, part of the reason I didn't require copyright assignment for Linux was because I want individuals to decide what they want to do with their copyrights." Many Linux copyright holders work with Conservancy on enforcement, following Linus' directive that they should make their own decisions about their copyrights.

— Bradley Kuhn

The file feature-removal-schedule.txt is ignored by most people except for people that add to it. It's more of a global TODO list for developers than being anything useful by anyone.

Add a feature removal of removing the feature-removal-schedule.txt.

— Steven Rostedt


Long-term support for the 3.2 kernel

Ben Hutchings announces: "I intend to maintain Linux 3.2.y for at least as long as Debian 7.0 is supported. The end-of-life for each Debian release is 1 year after the following release, so this would probably be around the end of 2015."


Mourning Doug Niehaus

Thomas Gleixner has sent us an extended obituary for Doug Niehaus, who passed away on August 21. "Doug has to be considered one of the pioneers of real-time Linux. His efforts of making Linux a venerable choice for real-time systems reach back into the mid 1990s. While his KURT (Kansas University Real-Time) project did not get the attention that Victor Yodaiken's RT-Linux development received for various reasons, his influence on the Linux kernel and on today's most popular Linux Real-Time project reaches much farther than most people are aware of."


Kernel development news

The 2012 Kernel Summit

By Michael Kerrisk and Jake Edge
August 29, 2012

The 2012 Kernel Summit was held in San Diego, CA, USA, over three days, 27-29 August. As with the 2011 Kernel Summit in Prague (and following on from discussions at the 2010 Kernel Summit), the 2012 summit followed a different format from the ten previous summits. For 2012, the event took the form of an invitation-only plenary-session day followed by two days of minisummits and additional technical sessions shared with the co-located 2012 Linux Plumbers Conference that kicked off on 29 August; the agenda for days 1 and 3 can be found here. (The ARM minisummit was something of an exception to this format: it ran for two days, starting on the same day as the plenary sessions.)

Main summit, day 1

The first day of the Kernel Summit, on 27 August, consisted of plenary sessions attended by around 80 invitees. Among the topics were the following:

The future of kernel regression tracking; the kernel development community is in strong agreement on the value of regression tracking, and is currently looking for some person(s) to take up this high-profile work.
Supporting old/oddball architectures, tool chains, and devices: how long must we support ancient hardware and software, and how do we leave it behind?
Regression testing; how can we do a better job of finding bugs before they bite users?
Distributions and upstream; what can kernel developers do to make life easier for their main customers — the distributors?
Lightning talks: quick sessions on RCU callbacks and Smatch.
Kernel build and boot testing; a new framework for quickly finding regressions.
Android upstreaming: the ongoing process of getting the Android kernel code into the mainline.
Improving the maintainer model; do our subsystem maintainers scale?
Stable kernel management; how is the stable process working?
Tracing and debugging, and how to get better oops output in particular.
Linux-next and related improvements to the development process.

Main summit, day 2

The memcg/mm minisummit covering a wide range of topics related to memory management.

Main summit, day 3

Module signing; toward a way to finally get this feature into the kernel.
Kernel summit feedback; how did the event work out this year, and what changes should be made for future years?

ARM minisummit, day 1

The first day of this year's Kernel Summit coincided with day one of the ARM minisummit. Given that the "minisummit" spanned two days, there was talk of false advertising, but there was lots to cover.

Secure monitor API: how best to support the secure monitor mode across a wide variety of processors.
Stale platform deprecation: some ARM platform support has clearly not been used for years; how do we clean out the cruft?
Virtualization is coming to ARM, but brings some issues of its own.
DMA mapping has seen a lot of work in the last year, but there is still a fair amount to be done.

ARM minisummit, day 2

Process review for the arm-soc tree: how well is this tree working toward the goal of cleaning up the ARM architecture code?
Toward a single kernel image: what needs to be done to get a single kernel that boots on multiple ARM processor families?
AArch64: the current status of 64-bit support for the ARM architecture.
A big.LITTLE update; how can the kernel support this novel architecture?
DMA issues and how to best support generic DMA engines in particular.

Linux Security Summit

Secure Boot: keynote from Matthew Garrett.
Secure Linux containers: using SELinux to create sandboxed containers.
Integrity for directories and special files: extending the Integrity Measurement Architecture (IMA) to handle directories and other special files.
DNSSEC: a look at the "cryptographically secured globally distributed database" for domain names and more.
Security modules and RPM: expanding the hooks in RPM to support Smack and other security technologies.
Kernel security subsystem reports: reports from subsystem maintainers.

Notes from others

PCI minisummit, notes posted by Bjorn Helgaas.
ARM minisummit, posted by Will Deacon.
Media workshop notes, part 1 by Mauro Carvalho Chehab.
The realtime microconference from LPC, courtesy of Darren Hart.

Acknowledgments

Michael would like to thank the Linux Foundation for supporting his travel to San Diego for this event; Jake would like to thank LWN subscribers for the same.


KS2012: The future of kernel regression tracking

By Michael Kerrisk
August 28, 2012

2012 Kernel Summit

For several years, Rafael Wysocki tracked regressions in the kernel, producing lists and statistical analysis of regressions in each kernel release. This task, which provided an (imperfect) measure of the increase or decrease in the quality of successive kernel releases, was considered so valuable by other kernel developers that Rafael was regularly invited to present his observations at the Kernel Summit (2008, 2009, 2010, and 2011). However, his presentation on this topic on the first day of the 2012 Kernel Summit had a rather different flavor, asking his peers what might be the future of regression tracking.

Over time, Rafael has steadily moved to working on other tasks in the kernel, and has had less time for regression tracking. Fortunately, for a time, a couple of other people stepped in to assist with the task of creating and maintaining the kernel Bugzilla reports that were used to track regressions. However, this work did not run all that smoothly. Rafael had already noted on previous occasions that the kernel Bugzilla was not well suited to the task of generating lists of kernel regressions. In addition, as Rafael stepped still further back from regression tracking, there seemed to be some differences of understanding between his successors and various kernel developers about how the Bugzilla should be used to track regressions. (Of note is the fact that Rafael was using Bugzilla bugs merely as a tool to track and measure regressions; whether kernel maintainers made use of those bugs as part of their work in fixing those regressions was a matter left to the maintainers.) These differences in understanding appear to be one of the reasons that Rafael's successors also stepped back from the task of regression tracking.

Which brings us to where we are today: for nearly half a year, there has been no tracking of kernel regressions. Furthermore, Rafael noted that his other commitments meant that he would not have time to return to this task in the future. This led him to ask a simple question: do we want to track kernel regressions?

At this point, many kernel developers spoke up to emphasize how valuable they had found Rafael's work. H. Peter Anvin, for example, noted that he is not a fan of Bugzilla, "But, I found the lists of regressions useful. It made me do things I didn't want to do." Linus Torvalds also noted that he loved the kind of overview that Rafael's work provided to him.

The session digressed into a variety of other topics. Rafael wondered whether the Bugzilla is even very useful as a tool for tracking and resolving kernel bugs. Responses from various developers showed that use of Bugzilla varies greatly across subsystems, with some subsystems relying on it heavily, while others avoid it in preference to mechanisms such as email. James Bottomley made the point that Bugzilla allows people unfamiliar with mailing lists to fire a bug onto Bugzilla, which then (automatically) appears on a mailing list. Bugzilla thus provides those users with a means to interact with the kernel developer workflow. Later in the session, the topic of Bugzilla versus mailing lists led Rafael to raise another concern: when some subsystems use Bugzilla while others use mailing lists or other mechanisms, what should the kernel developer community tell bug reporters about how to report bugs? That question often forms a difficult first hurdle for bug reporters. Unfortunately, there was little time to delve into that topic.

There was some general discussion about how Bugzilla should be used to track regressions, and whether there might be better tools than Bugzilla for this task, but no concrete alternatives were proposed. In the end, it was agreed that the question of tooling is secondary, and the tool choice might best be left to whoever takes on the task of regression tracking. The main point was that there was widespread consensus in the room that developers would like to see the regression tracking list return, and that the top priority was to find a person (or, possibly, several, so as to avoid overloading one individual and to ensure continuity when people are absent for vacations and so on) who would be willing to take on this task.

At this point, then, there's a vacancy for one or more kernel regression trackers. Although the work is unpaid, regression tracking is clearly a task that is highly valued by many kernel developers, and as Rafael's experience shows, when the work is done in a way that matches the development community's needs, the role has a high profile. (Interested volunteers should contact Rafael.)


KS2012: Supporting old/oddball architectures, tool chains, and devices

By Michael Kerrisk
August 29, 2012


H. Peter Anvin began his presentation on the first day of the 2012 Kernel Summit by noting that Linux supports an astonishing variety of hardware. In particular, it offers long-term support for older hardware, and is popular with some users for precisely that reason. Most of the time, supporting older and unusual hardware and tool chains is a worthwhile activity. Nevertheless, Peter's question was: what limits does the kernel development community want to set on the amount of effort invested in supporting old or unusual hardware architectures, build tool chains, and devices?

Peter's point is that supporting old and oddball systems inevitably has associated effort and problems. When it comes to supporting these systems, there is a balancing act: those systems may matter a lot to a small set of users, but supporting them imposes a development burden across a wide spectrum of the kernel development community.

Peter gave a few examples. The kernel still includes code to support old Intel 386 CPUs (i.e., the predecessor to 486 chips that appeared in 1989). One might ask whether anyone out there is still trying to boot Linux on such old hardware. However, Peter noted that as an x86 maintainer he had recently received a bug report from someone trying to boot Linux on such a system. But the genesis of that bug report is interesting. In essence, it came from someone who asked themselves, would I be able to boot a modern Linux kernel on this ancient system that I happen to have lying around? In response, Peter's question is, should the kernel developer community continue to invest effort to support such users?

There are various difficulties associated with supporting old and obscure systems. For example, maintainers may want to make kernel changes that could impact code that supports legacy systems, but it may be difficult to assess what that impact may be. Furthermore, it's unlikely that a maintainer will have the hardware needed to test the impact of changes that affect legacy code. Legacy code thus becomes rarely executed code, and rarely executed code is code that hides breakages and security holes. (Peter noted that architecture-specific "hooks" are the "common" solution for dealing with oddball architectures. Such hooks are a form of "come from" code about which it is very difficult to reason. The only way of dealing with that difficulty is careful documentation of the hook preconditions, postconditions, and invariants. It may come as no surprise that the likelihood that such documentation exists falls "somewhere between rarely and never".)

There are a few questions to consider when deciding whether to continue to support a legacy system. For example, do users of that system exist, how numerous are they, and how important is it to continue supporting them with modern kernels? The problem is that it is difficult to obtain data that answers these questions for obscure systems. In particular, Peter questioned the notion that bug reports equate to active users.

Legacy build tool chains present a similar problem. Peter notes that "People complain when we introduce something that triggers a bug in a 7-year-old tool chain. At some point we need to be able to say: if you want to use a new kernel, you've got to be prepared to install an appropriate tool chain."

Peter extended his discussion of legacy support to include user-space interfaces. One example he gave was compatibility code that has been unused for a long time. As a specific case, a.out support on x86-64 has been the source of several serious security holes; however, the a.out executable format was already largely obsolete by the time the x86-64 architecture arrived, and likely has never seen serious use. Is there any good reason to support such obsolete interfaces, and is there a sane path to deprecating and removing them?

At this point, Linus spoke up to detail his stance on changes to user-space interfaces: "I never said don't change the ABI. We have done it to fix bad bugs. Just don't break user space. If we can change the ABI, but no one complains, then we are good to go." As a step in the removal process, Rusty Russell proposed the addition of a CONFIG_MODERN option that would be on by default, and could be disabled if access is needed to legacy features. Len Brown responded that this would likely suffer a similar fate to CONFIG_EXPERIMENTAL, which was always turned on (i.e., CONFIG_MODERN would likely always be turned off by distributors). Thomas Gleixner instead suggested placing legacy features under configuration options marked as "bool n", preventing the options from being selected, and then removing the legacy features unless someone complains within a reasonable time.

The session ended without any particular conclusions, but a few people expressed sentiments in favor of removing support for legacy systems, and no one raised serious objections. Thus, it may be that in the future we see some serious effort devoted to cleaning out kernel code for systems that clearly see no serious contemporary use. (It is noteworthy that a discussion on removing stale platforms was also taking place in the ARM minisummit.)


KS2012: ARM: Secure monitor API

By Jake Edge
August 29, 2012


Will Deacon led the first KS2012 ARM minisummit session on creating a common API for the secure monitor mode of ARM processors. This security feature—part of the ARM TrustZone extensions—uses a secure monitor call (SMC) instruction to switch between secure and non-secure modes. It can be used to implement digital rights management (DRM), secure payment, and more. The secure monitor also provides services for booting and idling the processor. Currently, Linux has no common API to access secure mode, so various ARM platforms implement things differently. The current situation is painful, Deacon said.

What's needed, Deacon suggested, is a common API that could be used by any ARM system-on-a-chip (SoC) that needed it. Samsung has proposed an API that handles the boot and idle SMCs, which could be used as a starting point. But, handling SMCs is still likely to require some amount of board-specific code.

For some ARM boards, there are SMCs that need to be done in the early boot assembly code. There are various SoC errata that need to be worked around via SMCs early in the boot process. It would be preferable if calling the SMCs could be pushed into device drivers, but that may not be possible for all processors.

There are also some services provided via SMC, like video and audio format handling, which need to be accounted for. Those kinds of services could be described in the device tree for the processor if there is some agreement on how to create those descriptions. That would allow access to those services via drivers using the common API.

Deacon suggested that there was no need to solve all of the problems "in one go", and that focusing on the things that could be done from drivers would be the place to start. There will still need to be platform-specific code to handle set up that needs to be done before the MMU is enabled, as well as to handle quirks of some of the platforms.

Another problem is that the actual calls into secure mode are not standardized, and there are already multiple existing implementations. The differences can be in the mapping from a number to the actual SMC it corresponds with (somewhat akin to system call numbers). The parameters can also be different. Those differences could be described in the device tree for the platform, and used by the common SMC framework. Actually invoking an SMC would stay in the platform-specific code.

There is a recent recommendation from ARM to standardize the SMCs, but it is just that: a recommendation. That document also describes how to use SMCs for doing thermal and power management, so the common API could eventually incorporate some or all of those kinds of calls, but they are just recommendations that may or may not catch on.

By starting with what can be done in the C code from drivers, at least a partial solution to the "complete mess" that exists today can be achieved. Starting with the cpu_boot() and do_idle() hooks that the Samsung API provides, then adding additional SMCs as needed, will start cleaning up that mess.
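The layering discussed above, with a common entry point for drivers, a per-platform table mapping generic operations to SMC numbers, and the actual invocation left to platform code, might be sketched roughly as follows. This is a user-space illustration only: apart from the cpu_boot() and do_idle() hook names taken from the text, every identifier here (struct smc_platform, smc_call(), the SMC_OP_* values, the fake board and its function numbers) is hypothetical, and the hook signatures are guesses rather than the Samsung proposal's actual prototypes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Generic operations the common layer understands (hypothetical set). */
enum smc_op { SMC_OP_CPU_BOOT, SMC_OP_DO_IDLE, SMC_OP_COUNT };

/*
 * Per-platform description. In the discussion, the mapping from generic
 * operation to SMC number would come from the device tree; here it is
 * simply a table filled in by hand.
 */
struct smc_platform {
    uint32_t fn_id[SMC_OP_COUNT];               /* platform SMC numbers */
    long (*invoke)(uint32_t fn, uint32_t arg);  /* platform-specific call */
};

static const struct smc_platform *current_platform;

void smc_register_platform(const struct smc_platform *p)
{
    assert(p != NULL && p->invoke != NULL);
    current_platform = p;
}

/* Common entry point: translate the generic op, then defer to the platform. */
static long smc_call(enum smc_op op, uint32_t arg)
{
    if (current_platform == NULL || (unsigned)op >= SMC_OP_COUNT)
        return -1;
    return current_platform->invoke(current_platform->fn_id[op], arg);
}

/* The two hooks named in the text; these signatures are assumed. */
long cpu_boot(uint32_t cpu) { return smc_call(SMC_OP_CPU_BOOT, cpu); }
long do_idle(void)          { return smc_call(SMC_OP_DO_IDLE, 0); }

/*
 * A fake platform for demonstration: it records the last call instead of
 * executing a real SMC instruction. The numbers 0x10 and 0x22 are made up.
 */
uint32_t last_fn, last_arg;

static long fake_invoke(uint32_t fn, uint32_t arg)
{
    last_fn = fn;
    last_arg = arg;
    return 0;
}

const struct smc_platform fake_board = {
    .fn_id  = { [SMC_OP_CPU_BOOT] = 0x10, [SMC_OP_DO_IDLE] = 0x22 },
    .invoke = fake_invoke,
};
```

After smc_register_platform(&fake_board), a driver-level call such as cpu_boot(1) reaches fake_invoke() with function number 0x10 and argument 1. A second board could use entirely different numbers without any change to the callers, which is the point of keeping the mapping in per-platform data.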


KS2012: ARM: Stale platform deprecation

By Jake Edge
August 29, 2012


There are a handful of "minor" ARM platforms in the mainline that haven't been touched for as many as five years, Olof Johansson said in kicking off a discussion of perhaps deprecating some of those platforms. The code for those platforms gets updated whenever there are sweeping changes, but they may not have been tested in years. They build just fine, but no active developers have the hardware to actually test them. He wondered when or if those platforms can ever be removed from the tree.

He suggested that once device tree has been proven as a solution for reining in the explosion of board files in the ARM tree, perhaps a one- or two-year deadline could be set. Those platforms that don't update to use device tree could then be dropped.

There are still lots of older ARM platforms that are supported; OMAP 1 was cited as an example. Even though some in the discussion were a bit skeptical about older chips running mainline kernels, that does happen. There are hobbyists or others who keep the older chips working. Thus, those are not targets for deprecation.

But, Tony Lindgren noted that OMAP 2 has some 30 different board files and that there is "no way" those can all be converted to device tree. Arnd Bergmann suggested that in cases like that, the drivers should be converted to work with device tree and the board files should just be removed from the tree. Users of those platforms can either continue using older kernels or create device tree descriptions using the updated drivers.

That may not be possible in all cases. Lindgren mentioned that ARM maintainer Russell King has an automated board test setup that could be affected if those board files are removed. On some platforms, it may require major work to get power management and other features working with device trees. For legacy boards, it is unlikely that anyone will actually do that work.

There is a question of how to decide which board files should be deprecated, Lindgren said. A list of proposed deprecations should be created. Some kind of tool that uses Git to find which platforms have not been updated recently would be useful here.
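The core of such a tool could be quite small. The sketch below assumes that the time of the most recent commit for each platform directory has already been collected, for example with `git log -1 --format=%ct -- <dir>`, and only shows the filtering step. The structure, the function name, and the directory names used in any example are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* One platform directory and the time of its most recent commit. */
struct platform_activity {
    const char *dir;      /* e.g. "arch/arm/mach-foo" (made-up name) */
    time_t last_commit;   /* e.g. from: git log -1 --format=%ct -- <dir> */
};

#define SECONDS_PER_YEAR (365L * 24 * 60 * 60)

/*
 * Store in 'out' the directories whose last commit is more than
 * 'max_age_years' older than 'now'. Returns the number of entries written.
 * 'out' must have room for 'n' pointers.
 */
size_t find_stale_platforms(const struct platform_activity *p, size_t n,
                            time_t now, int max_age_years,
                            const char **out)
{
    size_t count = 0;
    time_t cutoff;

    assert(max_age_years >= 0);
    cutoff = now - (time_t)max_age_years * SECONDS_PER_YEAR;

    for (size_t i = 0; i < n; i++) {
        if (p[i].last_commit < cutoff)
            out[count++] = p[i].dir;
    }
    return count;
}
```

As Johansson observed, these platforms still get touched by sweeping tree-wide changes, so a raw "last commit" timestamp would probably make them look more active than they are. A practical version would likely need to ignore such commits, perhaps by author or by the number of directories a commit touches.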

In the discussion that followed, several different boards were mentioned as candidates for deprecation. Some participants spoke up for specific boards or noted that King used them in his testing. That led to a joking suggestion that someone find a way to surreptitiously relieve King of the burden of some of those boards.

The bcmring platform is particularly problematic because it was completely broken for roughly two years, but has recently been picked up by some hobbyists (who happen to work for Broadcom, creator of the platform). It has a "horrible" OS abstraction layer, so it doesn't really belong in the mainline in its present form. Will Deacon suggested that perhaps the platform could be moved to the staging tree; a checklist could be provided for what is needed before it would be acceptable in the mainline.

For defconfig files, Johansson proposed that "superset configurations" be created, which turn on every driver and feature that could be present on a platform. Those can then be "whittled down" by board or SoC vendors as needed. That will help increase the amount of build testing. Bergmann agreed, saying that having 30 defconfigs was not really a problem, even if only five are being used in practice. That would be a big improvement over the 130 or more defconfigs that are currently in the tree, Johansson said.

Specific platforms that were mentioned as targets for deprecation included ks8695, h720x, l7200, netx, and w90x900. In addition, ixp4xx was targeted for deprecation, but that may still be a ways off.


KS2012: ARM: Virtualization

By Jake Edge
August 29, 2012


The next KS2012 ARM minisummit session discussed the virtualization work that has been going on for ARM. Both KVM and Xen are under development for ARM, but neither has gotten to the point of being merged. Marc Zyngier gave an overview of the KVM status, while Stefano Stabellini reported on Xen.

Zyngier began by noting that virtualization extensions were added to the most recent revisions of the ARMv7 architecture. There is now a hypervisor mode in the processor, which runs at a higher privilege level than the OS.

For KVM, physical interrupts are handled by the host, with guests only seeing virtual interrupts. That stands in contrast to Xen where certain physical interrupts are delivered to the guests, as Stabellini reported. According to Olof Johansson, the virtualization model provided by ARM fits the Xen hypervisor-based virtualization better than KVM's kernel-based model.

Paul Walmsley asked about vendors who were using the hypervisor mode for doing cluster switch operations, and wondered how well that would work with KVM. Zyngier said that it would work "badly", because KVM and the cluster code would "fight" over hypervisor mode; whoever got there first would win. Will Deacon noted that those who wanted to run KVM on their systems would need to move the cluster code to a higher level.

In answer to a question from Magnus Damm, Zyngier said that KVM on ARM would not support virtual machine nesting. It also would not support the emulation of other CPUs, so the guest CPU must match that of the underlying hardware. The QEMU developers have decided that the work necessary to do that emulation was not worth the trouble, as one of the participants reported.

The KVM guests run at privilege level 1 (PL1), which is the level used for normal kernels, but the host kernel runs at PL2. That means that switching between guests requires lots of transitions, from PL1 to PL2, then back to PL1 for the switched-to guest (and possibly to a lower privilege level depending on what the guest is running).

Guests get preempted whenever a physical interrupt occurs, but the guests never see those, Zyngier said. A stage 2 page table is populated by the host for each of the guests, and the host has a stage 1 page table. There are no shadow page tables. Guests can also be preempted when pages need to be faulted in via the stage 2 page tables.
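To make the two-stage arrangement concrete, here is a toy model in user-space C. It assumes the usual split in which the guest's own stage 1 table maps guest virtual addresses to intermediate physical addresses, and the host-populated stage 2 table maps those to real physical addresses. Real ARM page tables are multi-level and carry permissions and attributes; the single-level tables, the 16-page address space, and all names here are simplifications invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_MASK  ((1u << PAGE_SHIFT) - 1)
#define NPAGES     16            /* toy address space: 16 pages */
#define UNMAPPED   UINT32_MAX

/* Toy single-level table: index = input page number, value = output page. */
struct page_table {
    uint32_t out_page[NPAGES];
};

enum xlate_result { XLATE_OK, XLATE_STAGE1_FAULT, XLATE_STAGE2_FAULT };

void table_init(struct page_table *t)
{
    assert(t != NULL);
    for (int i = 0; i < NPAGES; i++)
        t->out_page[i] = UNMAPPED;
}

/* Translate one address through one table; returns -1 if unmapped. */
static int lookup(const struct page_table *t, uint32_t in, uint32_t *out)
{
    uint32_t page = in >> PAGE_SHIFT;

    if (page >= NPAGES || t->out_page[page] == UNMAPPED)
        return -1;
    *out = (t->out_page[page] << PAGE_SHIFT) | (in & PAGE_MASK);
    return 0;
}

/*
 * Guest virtual -> intermediate physical (stage 1, owned by the guest),
 * then intermediate physical -> physical (stage 2, owned by the host).
 */
enum xlate_result translate(const struct page_table *stage1_guest,
                            const struct page_table *stage2_host,
                            uint32_t guest_va, uint32_t *pa)
{
    uint32_t ipa;

    if (lookup(stage1_guest, guest_va, &ipa) != 0)
        return XLATE_STAGE1_FAULT;   /* handled inside the guest */
    if (lookup(stage2_host, ipa, pa) != 0)
        return XLATE_STAGE2_FAULT;   /* guest is preempted; host faults page in */
    return XLATE_OK;
}
```

In this model a stage 2 fault corresponds to the case described above where the guest is preempted so that the host can fault the page in. Once the host fills the missing stage 2 entry, the same guest address translates without the guest noticing.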

Devices are mapped into the guests. The virtual CPU interface—part of the ARM generic interrupt controller (GIC)—is mapped in as well. It is believed that all devices can be mapped into the guests, but that has yet to be tried. Because of that, the same kernel can be used for both host and guests. Stabellini noted that the same is true for Xen, which is unlike the x86 situation.

Caches and TLB entries are tagged with an 8-bit virtual machine ID (VMID). Guests are not aware that there are no physical devices; they just poke what they think are hardware registers, a stage 2 translation is done, and the data is forwarded on to the hardware. These memory-mapped IO devices are emulated by QEMU.

Interrupts are injected into the guest by manipulating bits on the guest stack to indicate an interrupt. Xen, on the other hand, uses a "spare" interrupt to signal events to the guest. There is some concern that there is no real guarantee that there is always a free interrupt number to be used. Right now, Xen uses a fixed interrupt number, but that will likely change.

In order to boot a KVM host, the kernel must be started in hypervisor mode. That requires a KVM-compliant bootloader. When booting, a very small hypervisor is loaded, whose "only purpose in life is to be replaced". It has a simple API with just two calls, one to return a pointer to the stub itself, and one to query whether hypervisor mode is available. Zyngier said that he believes Xen could also use that hypervisor stub if desired. One possible problem area is that some "other payloads" (alternate operating systems) may not be able to handle being started with hypervisor mode on, so there may need to be a way to turn it off in the bootloader, Johansson said.

In contrast to KVM, Xen is a hypervisor that sits directly on the hardware, Stabellini said. Everything else is a guest, including Linux. All of the guests are fully aware that they are running on a hypervisor. Xen for ARM assumes that the full virtualization extensions are present and that nested page tables are available. Zyngier noted that KVM makes the same assumptions.

The Xen ARM guest is based on the Versatile Express board, but with far fewer devices defined in the device tree. The Xenbus virtualized bus is used to add paravirtualized devices into the guest. QEMU is not used, so there is no emulated hardware.

Xen ARM is "completely reliant" on device tree, Stabellini said. His biggest worry is that device tree might go away for ARM as he has heard that ACPI may be coming to ARM. The problem there is that the ACPI parser is too large to go into the Xen hypervisor (it roughly doubles the code size). Parsing device trees is much easier, and requires much less code, so trying to do the same things with ACPI "would be a nightmare".

Johansson pointed out that the decision about ACPI would not be made by Linux developers or ARM; there is a large company in Washington that will determine that. For power management on some devices, ACPI handling may be required. But, as Zyngier said, adding ACPI to ARM does not mean the death of device tree.

The governance of ACPI is closed now, and that needs to change so that the ARM community can participate, one participant said. According to Arnd Bergmann, embedded systems will not be moving to ACPI any time soon, but there is a real danger that it will be present on server systems. ARM devices that are targeted at booting other OSes will be using UEFI, which can pass the device tree to the kernel in the right format, he said.

The ARM Xen hypervisor is almost fully upstream in the Xen tree at this point. The Linux kernel side has been posted, and is not very intrusive, Stabellini said. The patches to the kernel are mostly self-contained, with only small changes to the core.

Another concern was the stabilization of the device tree format. If that changes between kernel releases, there can be a mismatch between the device tree and the kernel. Bergmann said that kernel developers are being asked to ensure that anything they add to the device tree formats continues to work in the future, while firmware developers are being warned not to assume a given device tree works with any earlier kernels. Once all main platforms have been described with device trees, there will be an effort to ensure that those don't break in the future, he said.


KS2012: ARM: DMA mapping

By Jake Edge
August 29, 2012


In the last discussion on day one of the 2012 ARM minisummit, Marek Szyprowski gave a status update on changes in the ARM DMA subsystem over the last year. There has been a lot of work in that time, with most of it having been merged in 3.5. The most important change is the conversion to dma_map_ops, which provides a common DMA framework that can be implemented as needed for each architecture. It allows for both coherent and non-coherent devices, and supports bounce buffers and IOMMUs.
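The general idea behind an operations table of this kind can be shown with a small user-space sketch. This is not the kernel's struct dma_map_ops, which has many more members. The toy_* names, the two-member table, and the flush counter are all invented here to show how one generic entry point can serve coherent and non-coherent devices.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t toy_dma_addr_t;

struct toy_device;

/* Operations an architecture or bus would supply (hypothetical subset). */
struct toy_dma_ops {
    toy_dma_addr_t (*map)(struct toy_device *dev, void *cpu_addr, size_t len);
    void (*unmap)(struct toy_device *dev, toy_dma_addr_t addr, size_t len);
};

struct toy_device {
    const struct toy_dma_ops *ops;
    unsigned cache_flushes;      /* bookkeeping for the demo only */
};

/* Coherent device: hardware keeps caches in sync, nothing extra to do. */
static toy_dma_addr_t coherent_map(struct toy_device *dev, void *cpu_addr,
                                   size_t len)
{
    (void)dev; (void)len;
    return (toy_dma_addr_t)(uintptr_t)cpu_addr;
}

static void coherent_unmap(struct toy_device *dev, toy_dma_addr_t addr,
                           size_t len)
{
    (void)dev; (void)addr; (void)len;
}

/* Non-coherent device: count where cache maintenance would be needed. */
static toy_dma_addr_t noncoherent_map(struct toy_device *dev, void *cpu_addr,
                                      size_t len)
{
    (void)len;
    dev->cache_flushes++;        /* stand-in for cleaning the cache */
    return (toy_dma_addr_t)(uintptr_t)cpu_addr;
}

static void noncoherent_unmap(struct toy_device *dev, toy_dma_addr_t addr,
                              size_t len)
{
    (void)addr; (void)len;
    dev->cache_flushes++;        /* stand-in for invalidating the cache */
}

const struct toy_dma_ops coherent_ops = { coherent_map, coherent_unmap };
const struct toy_dma_ops noncoherent_ops = { noncoherent_map, noncoherent_unmap };

/* Generic entry points: drivers call these and never see which ops apply. */
toy_dma_addr_t toy_dma_map(struct toy_device *dev, void *cpu_addr, size_t len)
{
    assert(dev != NULL && dev->ops != NULL && dev->ops->map != NULL);
    return dev->ops->map(dev, cpu_addr, len);
}

void toy_dma_unmap(struct toy_device *dev, toy_dma_addr_t addr, size_t len)
{
    assert(dev != NULL && dev->ops != NULL && dev->ops->unmap != NULL);
    dev->ops->unmap(dev, addr, len);
}
```

A driver written against toy_dma_map() and toy_dma_unmap() behaves the same on either kind of device. An IOMMU or bounce-buffer implementation would, in the same spirit, be one more ops table rather than a change to the callers.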

The second most important change was the addition of the Contiguous Memory Allocator (CMA). It is in 3.5, but is still marked as experimental. It has been tested on some systems, and Szyprowski hopes that it will be stabilizing over the next kernel cycle or so.

Lastly, a bunch of new attributes for DMA operations have been added. These are mostly for improving performance and to "avoid some hacks", Szyprowski said. For upcoming releases, he would like to work on better support for declaring coherent areas.

For 3.5, there was work to remove some of the limits on DMA, in particular, the 2MB limit on mappings. The fixed-size coherent area has been replaced with memory from vmalloc(). That can't be done in atomic context, however, so there is a small pre-allocation for use in that context. For some devices that buffer was too small, so the size has been made platform-dependent. The IOMMU implementation had no support for an atomic buffer at all, but patches have been posted recently, which he hopes to get into 3.6.
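The split between the two allocation paths can be modeled roughly as follows. This is a toy Python sketch rather than the ARM implementation; the class CoherentAllocator, its atomic_pool_bytes parameter, and the returned tuples are all hypothetical.

```python
class CoherentAllocator:
    """Toy model: a small pre-allocated pool serves atomic callers."""

    def __init__(self, atomic_pool_bytes: int):
        # The pool size is a per-platform choice in this model.
        self.atomic_free = atomic_pool_bytes

    def alloc(self, size: int, atomic: bool):
        if not atomic:
            # Sleeping is allowed, so memory can be mapped on demand.
            return ("remapped", size)
        if size > self.atomic_free:
            return None  # pool exhausted: the allocation fails
        self.atomic_free -= size
        return ("atomic-pool", size)
```

A platform whose devices need larger atomic buffers would simply construct the allocator with a bigger pool.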

The IOMMU code is not particularly ARM-specific, Szyprowski said; it could be used for other architectures. There is a bit more work to isolate the common code and make it generic, but he would need to coordinate that work with the other architectures. Arnd Bergmann suggested just moving the code to a generic place, but leaving it turned off for other architectures. That would allow others interested to turn it on and try it out.

Bergmann noted that when CMA was proposed a year and a half ago, it was envisioned that it would be unconditionally built for all v6 and v7 platforms. But that would make all recent ARM architectures depend on an experimental feature, so he suggested that it might be time to turn off the experimental designation.

There are still some issues that need to be resolved before that can happen, Szyprowski said. There are cases where the allocation can fail because of different accounting between movable and non-movable regions. But Mel Gorman strongly recommended building CMA by default since the problems just result in an allocation failure and do not cause a full system failure. He suggested making CMA the default with a fall-back to the old code if it fails. That way people will start using the feature, potentially see fall-back warnings, and help fix the problems. If it stays as an experimental feature, he fears that no one will actually use and test CMA.
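The fall-back arrangement Gorman described amounts to a try-then-warn pattern. A minimal Python sketch of that pattern follows; the function alloc_with_fallback, the AllocationError exception, and the two allocator callables are hypothetical stand-ins, not kernel interfaces.

```python
import warnings


class AllocationError(Exception):
    """Raised by the toy allocators when a request cannot be satisfied."""


def alloc_with_fallback(size, cma_alloc, legacy_alloc):
    """Try the new allocator first; warn and use the old one on failure."""
    try:
        return cma_alloc(size)
    except AllocationError as exc:
        # The warning is the point: users see it and can report the problem.
        warnings.warn(f"CMA allocation of {size} bytes failed ({exc}); falling back")
        return legacy_alloc(size)
```

The failure stays visible through the warning, but the caller still gets its memory from the older path.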

Bergmann thought that any platform using a boot time reservation of memory (i.e. a "carve out") should be forced into using CMA. One of the problems with that idea is that some of the carve-outs are not upstream because they are for out-of-tree graphics hardware. In addition, the vendors are moving on and are no longer interested in adding features or updating their drivers to use a new feature like CMA.

Noting that there are multiple ways to do carve-outs, Gorman also suggested creating a core carve-out API for code consolidation. It could provide memory that is isolated or DMA-able, for example, so that all of the carve-outs in the kernel could use it. CMA could underlie that API, and it could implement the fall-back until CMA shakes out.

Fragmentation within CMA regions was mentioned as a concern. While Gorman didn't think it all that likely to happen in practice, some noted that there were already problems when using memory regions for OpenGL. User space actions can cause significant fragmentation in that case. Szyprowski suggested using separate CMA regions as a way to reduce the problem.

CMA still needs work to support highmem; there is no reason that it needs to be restricted to lowmem. Szyprowski hopes to get some time to work on that in the future. Wiring up CMA to x86 DMA is another thing that he plans to work on.


KS2012: ARM: Process review for the arm-soc tree

By Jake Edge
August 29, 2012

2012 Kernel Summit

Arnd Bergmann and Olof Johansson started day two of the ARM minisummit with a look at the arm-soc tree that they have been managing. They wanted to go over what has happened with the tree during the last year to see what was working and what could be improved. We are "trying to make you all happy", Bergmann said, while also trying to keep Linus Torvalds happy, which are conflicting goals at times.

The work split between the two has worked well, Bergmann said. When one of them has no time, the other has been able to pick up the slack. From a personal perspective, Bergmann said he is most unhappy when he has to reject a patch set. Actually it is worse when he has to make a decision about patches; some are easy to reject out of hand, but others are more difficult. If a huge patch set comes in, perhaps late in terms of getting it ready for the next merge window, or with lots of good patches but some that do "really nasty" things, he has to decide whether to reject it or not.

One thing to note, Bergmann said, is that Torvalds said at the Kernel Summit that he is "not totally hating our guts anymore". That's progress. Paul Walmsley asked what things Torvalds is most sensitive to in terms of the ARM tree these days. Bergmann said that he was not sure what the problems are now, but, in the past, the totally uncoordinated nature of ARM development was the main problem.

It used to be that Torvalds would get 15 pull requests for various sub-architectures. That could lead to lots of merge conflicts and dependencies between trees, which annoyed him. The last merge window didn't have many of those problems. The number of patches was down slightly, but not hugely, and not enough to explain that reduction, Bergmann said.

Walmsley followed up by asking what the arm-soc maintainers would like to see from the sub-architecture maintainers. Johansson said that using signed Git tags would be very nice. That helps because the tag message ends up in the merge commit. A signed tag also verifies that the patches come from the person they purport to come from, but the most important thing that signed tags bring is that message in the merge commit. Bergmann added that he tries to come up with something for the merge commit if there is no signed tag, but he would much rather get something from the maintainer directly.

One of the goals of the arm-soc tree is to facilitate (and force) the ARM cleanup process. The hope was that it would help pressure maintainers' managers to free up more time for that work. Bergmann asked if that process was working. Linus Walleij noted that the best pressure on management comes from customers, which, for him, are the handset and equipment manufacturers. Those manufacturers or Google make for an effective lever to change things. He is not sure how it came about (and Bergmann expressed surprise as well) but some customers are now asking for device tree support, which makes it easier to convince his management to spend time on that work.

Pushback from distributions is missing currently, Tony Lindgren said. Right now, ARM is distribution-unfriendly; device makers and SoC vendors are not getting the feedback to fix that. Walmsley wondered if the distribution and customer requirements would be in conflict, which could lead to problems.

Lindgren said that he sees tablets running different distributions in the future, but the device makers may not know what the distributions need. But Johansson cautioned that device makers aren't very interested in hearing from those who aren't shipping significant volumes of their product. Volumes in the five- or even six-digit range just aren't of that much interest to the device makers. For the most part those manufacturers are just following Android, Stephen Warren added.

Ben Dooks was concerned that ARM driver maintenance would suffer as those drivers move out of the arch/arm tree. Bergmann disagreed with that assessment because he thinks the overall work will become easier. The drivers will be centralized and use the same frameworks, so the maintenance burden will actually decrease.

Overall, there weren't many complaints about how things are going. For the most part, participants seemed pleased with how the arm-soc tree, and the overall ARM development process, was working. There's still plenty to do, but the process piece seems largely nailed down.


KS2012: ARM: Toward a single kernel image

By Jake Edge
August 29, 2012

2012 Kernel Summit

Over the last two or three years, the ARM Linux development community has been working toward the goal of having a single kernel image that can boot on multiple ARM platforms. One of the preconditions for creating such an image is the elimination of duplicate header files in the tree, which has mostly been completed, Arnd Bergmann said. The biggest problem now is that platform-specific header files are included into the drivers. When building a multi-platform kernel, which platform's headers does the driver get?

Drivers really shouldn't be including the platform-specific headers (from the mach-* directories), but many do. There are 300-350 header files under mach-* that are currently used by drivers. There are a number of reasons why this happened: various frameworks were missing for things like power domains, it was easier to add a header file into a directory that is owned by the platform rather than arguing about getting it into a more generic place, and so on. mach-* became a dumping ground, Bergmann said.

He has a patch that would rename all of those include files so that the platform name becomes the prefix of the header filename. It also changes the references in the driver source files to include the proper file. That doesn't solve the dumping ground problem, it simply works around it so that multi-platform kernels can be built.
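The rewrite of driver include lines could look something like the following Python sketch. The article does not give the exact naming scheme, so the form used here (mach/PLATFORM-name.h) is an assumption, as are the function name rewrite_mach_include and the platform name in the usage example.

```python
import re

# Matches lines such as '#include <mach/gpio.h>'.
MACH_INCLUDE = re.compile(r'^(\s*#\s*include\s*)<mach/([^>]+)>')


def rewrite_mach_include(line: str, platform: str) -> str:
    """Prefix the header's file name with the platform name (assumed scheme)."""

    def repl(match):
        directory, _, base = match.group(2).rpartition("/")
        prefix = f"{directory}/" if directory else ""
        return f"{match.group(1)}<mach/{prefix}{platform}-{base}>"

    # Lines that do not include a mach/ header are returned unchanged.
    return MACH_INCLUDE.sub(repl, line)
```

For example, rewrite_mach_include('#include <mach/gpio.h>', 'omap') yields '#include <mach/omap-gpio.h>', so two platforms' gpio.h headers would no longer collide in a combined build.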

Bergmann said that ARM maintainer Russell King was not in favor of that approach. Instead, King would like to use the single zImage as something of a carrot to get the sub-architecture maintainers to clean things up. King thinks that the platform-specific directories should not be in the include path for building the drivers, which would force the issue.

One participant suggested that there aren't that many things to fix per platform, but Bergmann disagreed. There is a lot of work to do for some platforms, including some of those that are the "most interesting", such as Samsung and OMAP.

Magnus Damm suggested that checkpatch be extended to complain about drivers that include files from the platform-specific directories. That would help to ensure new drivers were not including improper headers. But Bergmann said that he didn't use checkpatch before accepting patches, though he admitted that maybe he should do so. Paul Walmsley said that OMAP requires patches to be checkpatch clean (other than 80-column warnings) before accepting them.

Rob Herring has an alternative approach that is likely to be more acceptable to King. He has reworked the header files without renaming them, which reduces the code churn. There are still three problematic header files, though: uncompress.h, gpio.h, and timex.h. But Herring can build a number of platforms into a single zImage without using Bergmann's renaming trick.

Bergmann wanted to see if the assembled ARM developers could come to a conclusion on the right approach. Basically, either of the two header file rearrangement solutions could solve the technical problems in building multi-platform kernels, but they wouldn't force the cleanups that King would like to see happen. In general, most in the room seemed in favor of getting things cleaned up so that there is a clean separation between drivers and platforms—as King has advocated.

It is a perfect task for Linaro, as Walmsley pointed out. It was noted that the worst offenders are all Linaro members, which makes it align well with the organization's mission. Bergmann said that Linaro has some people working on multi-platform kernels who could potentially work on the project.

The conversation turned toward how to get there. Tony Lindgren said that he could do an initial pass on OMAP in the next week or so to start to figure out how to fix up the drivers. There are certain frameworks (common clock, sparse IRQ) that drivers and platforms will be required to use in order to be included in the single zImage effort. In addition, SMP-capable platforms will need to use Marc Zyngier's smp_ops framework, which Bergmann will be reworking and posting in the near future.

Using Herring's header file changes, but not renaming all the mach-* include files, is the basic approach chosen. That will still use some parts of Bergmann's changes. In the end, it will still be a fair amount of code churn, so there was discussion of how to manage those changes in the arm-soc tree. The intent is to try to make it work for the 3.7 cycle, with a fallback to making those changes the base patch for the arm-soc tree for 3.8 if it gets too messy.

Bergmann also demonstrated the Kconfig changes that he has made so that kernel developers can enable multiple platforms in their kernel builds. Once multi-platform support is selected, then one or more of the ARM architecture versions (v4-7) can be chosen. For each architecture, possible SoCs are listed and, if none is chosen, a default is picked. In addition, SoC maintainers can decide whether to expose individual boards for selection. Those Kconfig changes could be used as the basis for building multi-platform kernels once the rest of the work is done.

The header file renaming script will still be useful, Bergmann said, to help figure out the include file dependencies, which drivers require which platforms, and so on. Using shell tools and grep on a renamed tree can give some insights into how things are currently organized. That will help as these driver problems are unwound on the way to a multi-platform ARM kernel image.
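As a rough illustration of that kind of analysis, the Python sketch below reports which platforms each source file depends on. It assumes a hypothetical renamed form of mach/PLATFORM-name.h with a platform name made of lowercase letters and digits; the function name platforms_by_driver and the in-memory input format are also assumptions made to keep the example self-contained.

```python
import re

# Captures the platform prefix from includes like '<mach/omap-gpio.h>'.
RENAMED_INCLUDE = re.compile(
    r'^\s*#\s*include\s*<mach/([a-z0-9]+)-[^>/]+>', re.MULTILINE
)


def platforms_by_driver(sources: dict) -> dict:
    """Map each source path to the set of platforms whose headers it includes."""
    deps = {}
    for path, text in sources.items():
        found = set(RENAMED_INCLUDE.findall(text))
        if found:
            deps[path] = found
    return deps
```

A driver that shows up with more than one platform in its set is one that would need attention before it could be built into a multi-platform kernel.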


Patches and updates

Kernel trees

Greg KH Linux 3.5.3

Greg KH Linux 3.4.10

Steven Rostedt 3.4.10-rt18

Steven Rostedt 3.2.28-rt41 (this is for real)

Greg KH Linux 3.0.42

Architecture-specific

Gerald Schaefer thp: transparent hugepages on System z

Fenghua Yu x86: Arbitrary CPU hot(un)plug support

Rob Herring ARM: initial multiplatform support

Core kernel code

pjt@google.com sched: per-entity load-tracking

aris@redhat.com cgroup: add xattr support

Eric W. Biederman userns subsystem conversions

Daniel Wagner cgroup: Assign subsystem IDs during compile time

Kees Cook module: allow loading module from fd

Device drivers

Yann Cantin new USB eBeam input driver

Thomas Abraham pinctrl: add support for samsung pinctrl driver

G.Shark Jeong leds: Add new LED driver for lm355x chips

Arun Murthy modem_shm: U8500 SHaRed Memory driver(SHRM)

Dong Aisheng add syscon driver based on regmap for general registers access

larsi@wh2.tu-dresden.de mfd: viperboard driver added

cjren@qca.qualcomm.com net: add new QCA alx ethernet driver

Kent Yoder tpmdd: TPM drivers, tpm-rng and fixes

Naresh Kumar Inna csiostor: Chelsio FCoE offload driver submission

Krystian Garbaciak DA906x PMIC driver

Documentation

Michael Kerrisk (man-pages) man-pages-3.42 is released

Filesystems and block I/O

Cyrill Gorcunov extended fdinfo via procfs series, v7

Goffredo Baroncelli BTRFS sysfs support

Josef Bacik Btrfs: turbo charge fsync

Paul Clements add discard support to nbd

Memory management

Minchan Kim mm: support MIGRATE_DISCARD

Rafael Aquini make balloon pages movable by compaction

wency@cn.fujitsu.com memory-hotplug: hot-remove physical memory

Networking

Julian Anastasov Interface for TCP Metrics

Vlad Yasevich Add basic VLAN support to bridges

Patrick McHardy netfilter: IPv6 NAT

Virtualization and containers

Wen Congyang kvm: notify host when the guest is panicked

Dong Hao KVM: perf: kvm events analysis tool

Paolo Bonzini Multiqueue virtio-scsi

Miscellaneous

Mathieu Desnoyers Userspace RCU 0.7.4

Page editor: Jonathan Corbet