ARM's multiply-mapped memory mess
First, some background. Contemporary processors do not normally address memory directly; instead, memory accesses are mediated through mappings created in the hardware's memory management unit. Depending on the processor, those mappings may be controlled through segment registers, page table entries, or some other means. The mapping will translate a virtual address into a physical address, but it also controls how the processor will access that memory and, perhaps, cache its contents.
As explained by ARM maintainer Russell King, ARM processors have a number of attributes which affect how memory mappings work. There is the concept of a memory type; "normal memory" is subject to reordering of reads and writes, while "device memory" is not, for example. There is also a bit indicating whether memory can be shared between processors; unshared memory is faster because there is no need to worry about cross-processor cache coherency. Then, like many CPUs, ARM processors can specify different caching behavior in the mapping; RAM might be mapped with writeback caching enabled, while device memory is uncached.
The ARM kernel maps RAM as normal memory with writeback caching; it's also marked non-shared on uniprocessor systems. The ioremap() system call, used to map I/O memory for CPU use, is different: that memory is mapped as device memory, uncached, and, maybe, shared. These different mappings give the expected behavior for both types of memory. Where things get tricky is when somebody calls ioremap() to create a new mapping for system RAM.
The problem with these multiple mappings is that they will have differing attributes. As of version 6 of the ARM architecture, the specified behavior in that situation is "unpredictable." Users, as a rule, are not enamored of "unpredictable' behavior, especially when their data is involved. So it would make sense to avoid multiple memory mappings with differing attributes. The ARM architecture has traditionally allowed this kind of mapping, though, and a number of drivers, as a result, rely on being able to remap RAM in this way.
Back in April, Russell raised an alarm about this issue, and posted a patch causing ioremap() to fail when the target is system RAM. This change avoids the potential data corruption issue, but at the cost of breaking every driver using ioremap() in this way. There were complaints at the time, so the patch sat out the 2.6.35 development cycle, but Russell merged it for 2.6.36. There it sat until, with the release imminent, Felipe Contreras posted a patch backing out the change, saying:
Russell was not impressed. In his view, remapping RAM in this way is a dangerous technique which will lead to data corruption sooner or later. Despite being warned six months ago, driver developers have not fixed the problem - there are as many broken drivers now as there were before. So, he says, there is no benefit to waiting any longer; the dangerous behavior should be stopped before somebody gets burned.
On the other side, driver developers point out that everything "seems to work" as it is, so there is no urgent need for change. Furthermore, Russell's patch looks to them like an API change, and the normal rule of kernel development is that anybody making internal API changes is charged with cleaning up any resulting messes. Fixing the drivers is not a trivial task, and it's Russell's contention that they have always been broken, so he is not willing (or necessarily able) to make them all work again.
The situation looked stalled, with a reversion of the patch looking like the only way forward. But, in fact, it looks like there is a way out. The first is to allow those mappings for one more cycle, but to put in a user-visible warning when they happen. As Andrew Morton put it:
It is the "worried users" who have been missing from the equation so far; they can provide a type of pressure which, seemingly, is unavailable to worried subsystem maintainers.
The other piece of the solution is to give driver developers a way to obtain a chunk of physically-contiguous RAM which can be remapped in this way. Such memory cannot be mapped simultaneously as system RAM. One nice idea would be to simply unmap system memory when it is put to a device's use, but that proves to be difficult to implement. The alternative is to just set aside some memory at boot time and never let the kernel use it for any other purpose; drivers can then allocate from that pool when they need to. Russell has posted a patch which makes this kind of memory set-aside possible.
So this particular situation will probably have a happy outcome, presuming
that the above outcome happens and that that no users are burned by
unpredictable mappings with the 2.6.36 kernel. But it highlights some
ongoing problems. It can be very hard to get developers to fix things,
especially if the current code "seems to work." Those developers also
became aware of the change at a very late date - if, indeed, they are even
aware of it now. It seems that testing of -rc kernels by developers is not
happening as much as we would like. Still, the development process seems
to work, and problems like this are overcome eventually.
Index entries for this article | |
---|---|
Kernel | Architectures/Arm |
Kernel | Development model |
Kernel | ioremap() |
Posted Oct 14, 2010 3:05 UTC (Thu)
by corbet (editor, #1)
[Link] (1 responses)
Posted Oct 14, 2010 6:24 UTC (Thu)
by ebiederm (subscriber, #35028)
[Link]
Can't the ARM guys just generalize the hard work that happened on x86 to guarantee to pages never get mapped to userspace with different attributes?
Posted Oct 14, 2010 6:42 UTC (Thu)
by dlang (guest, #313)
[Link] (8 responses)
but even if it only took a week for a distro to QA and ship a kernel, and users started putting pressure on immediately, there is no chance for any development work to take place and still make it into the mainline kernel before the merge window closes.
it will probably take multiple kernel cycles before the warning is seen by any users who don't compile their own kernels, and then more time for the pressure from the users to build up, and then development time to fix the problem, and then time to merge the fixes upstream.
I don't see this taking place in less than a year, and probably a lot longer before everything is fixed.
Posted Oct 14, 2010 13:24 UTC (Thu)
by NAR (subscriber, #1313)
[Link] (7 responses)
Posted Oct 14, 2010 15:00 UTC (Thu)
by marcH (subscriber, #57642)
[Link]
Posted Oct 14, 2010 15:19 UTC (Thu)
by felixfix (subscriber, #242)
[Link]
Posted Oct 15, 2010 12:53 UTC (Fri)
by drag (guest, #31333)
[Link] (2 responses)
ARM 6 is essentially a specification for a processor and not a real processor. Using the memory in the way the kernel does is 'unspecified'. It could work on today's processors made by Ti, but it could completely backfire on tomorrow's processors made by Marvel.
It's impossible to know and if it does start corrupting memory in the kernel then it's going to be 100% ok as far as the processor designers are concerned because they are still following the specification.
It's similar to having the kernel rely on unspecified GCC features were once a user chooses a GCC version they are forced to use it for ever and cannot change it no matter how badly it works with Linux.
Posted Oct 15, 2010 18:34 UTC (Fri)
by dlang (guest, #313)
[Link] (1 responses)
Posted Oct 15, 2010 22:17 UTC (Fri)
by gnb (subscriber, #5132)
[Link]
Posted Oct 17, 2010 15:58 UTC (Sun)
by oak (guest, #2786)
[Link]
Posted Oct 18, 2010 20:19 UTC (Mon)
by mwh (guest, #582)
[Link]
Posted Oct 14, 2010 15:08 UTC (Thu)
by marcH (subscriber, #57642)
[Link] (2 responses)
Finally rejecting memory corruption is an API change? I can imagine such a logic backfiring and making his author not look serious.
Posted Oct 16, 2010 22:33 UTC (Sat)
by jzbiciak (guest, #5246)
[Link] (1 responses)
gets() is still in the C language, despite it being next to impossible to use safely. But if you link against it, some platforms give you a big scary warning. If you're standards-conforming, you have to have it though. Removing it is an API/ABI change. In some ways, the situation with ARM's memory remapping sounds similar: a dangerous practice that seems to work suitably, despite the potential for it to go pear shaped when you least expect.
Posted Oct 17, 2010 13:04 UTC (Sun)
by marcH (subscriber, #57642)
[Link]
... whereas ioremap() is a Linux internal.
> despite it being next to impossible to use safely.
... whereas using ioremap() safely is not a problem.
> Removing it is an API/ABI change.
... whereas no one suggests removing ioremap().
> In some ways, the situation with ARM's memory remapping sounds similar
Not really.
Posted Dec 16, 2010 3:52 UTC (Thu)
by tiffang (guest, #48653)
[Link]
Since the article was posted, Linus has merged a patch re-enabling ioremap() on system RAM which emits a big, scary warning.
Followup
Puzzled
ARM's multiply-mapped memory mess
ARM's multiply-mapped memory mess
ARM's multiply-mapped memory mess
ARM's multiply-mapped memory mess
ARM's multiply-mapped memory mess
ARM's multiply-mapped memory mess
ARM's multiply-mapped memory mess
ARM's multiply-mapped memory mess
ARM's multiply-mapped memory mess
ARM's multiply-mapped memory mess
ARM's multiply-mapped memory mess
ARM's multiply-mapped memory mess
ARM's multiply-mapped memory mess
flush dcaches, mmu entries.. then give that clean page to device drivers.
And rollback when device drivers give it back afterward.
Can it be possible?