Memory-management optimization with DAMON

By Jonathan Corbet
February 20, 2020

To a great extent, memory management is based on making predictions: which pages of memory will a given process need in the near future? Unfortunately, it turns out that predictions are hard, especially when they are about future events. In the absence of useful information sent back from the future, memory-management subsystems are forced to rely on observations of recent behavior and an assumption that said behavior is likely to continue. The kernel's memory-management decisions are opaque to user space, though, and often result in less-than-optimal performance. A pair of patch sets from SeongJae Park tries to make memory-usage patterns visible to user space, and to let user space change memory-management decisions in response.

At the core of this new mechanism is the data access monitor or DAMON, which is intended to provide information on memory-access patterns to user space. Conceptually, its operation is simple; DAMON starts by dividing a process's address space into a number of equally sized regions. It then monitors accesses to each region, providing as its output a histogram of the number of accesses to each region. From that, the consumer of this information (in either user space or the kernel) can request changes to optimize the process's use of memory.

Reality is a bit more complex than that, of course. Current hardware allows for a huge address space, most of which is unused; dividing that space into (for example) 1000 regions could easily result in all of the used address space being pushed into just a couple of regions. So DAMON starts by splitting the address space into three large chunks which are, to a first approximation, the text, heap, and stack areas. Only those areas are monitored for access patterns.

For each region, DAMON tries to track the number of accesses. Watching every page in a region would be expensive, though, and one of the design goals of DAMON is to be efficient enough to run on production workloads. These objectives are reconciled by assuming that all pages in a given region have approximately equal access patterns, so there is no need to watch more than one of them. Thus, within each region, the "accessed" bit on a randomly selected page is cleared, then occasionally checked. If that page has been accessed, then the region is deemed to have been accessed.

It would be nice if a process being monitored would helpfully line up its memory-access patterns to match the regions chosen by DAMON, but such cooperation is rare in real-world systems. So the layout of those equally sized regions is unlikely to correspond well with how memory is actually being used. DAMON attempts to compensate for this by adjusting the regions on the fly as the process executes. Regions showing heavy access patterns are divided into smaller areas, while those seeing little use are coalesced into larger blocks. If all this works well, the result over time should be a zeroing-in on the truly hot areas of the target process's address space.

To control all of this, DAMON creates a set of virtual files in the debugfs filesystem. There is no access control implemented within DAMON itself, but those files are set up for root access only by default. All of the relevant parameters — target process, number of regions, and sampling and aggregation periods — can be configured by writing to those files. The resulting data can be read from debugfs; it is also possible to have the kernel write sampling data directly to a file, from which it can be processed at leisure. As an alternative, users can attach to a tracepoint to receive the data as it is generated; this makes it readily available to the perf tool, among other things.

That data, however it is obtained, is essentially a histogram; each memory region is a bin and the number of hits in that bin is recorded. That data can be analyzed by hand, of course; there is also a sample script that can feed it to gnuplot to present the information in a more graphic form. This information, Park says, can be highly useful:

To see the usefulness of the monitoring, we optimized 9 memory intensive workloads among them for memory pressure situations using the DAMON outputs. In detail, we identified frequently accessed memory regions in each workload based on the DAMON results and protected them with mlock() system calls. The optimized versions consistently show speedup (2.55x in best case, 1.65x in average) under memory pressure.

That kind of speedup certainly justifies spending some time looking at a process's memory patterns. It would be even nicer, though, if the kernel could do that work itself — that is what a memory-management subsystem is supposed to be for, after all. As a step in that direction, Park has posted a separate patch set implementing the "data access monitoring-based memory operation schemes". This mechanism allows users to tell DAMON how to respond to specific sorts of access patterns. This is done through a new debugfs file ("schemes") that accepts lines like:

    min-size max-size min-acc max-acc min-age max-age action

Each rule will apply to regions between min-size and max-size in length with access counts between min-acc and max-acc. These counts must have been accumulated in a region with an age between min-age and max-age. The "age" of a region is reset whenever a significant change happens; this can include the application of an action or a resizing of the region itself.

The action is, at this point, a command to be passed to an madvise() call on the region; supported values are MADV_WILLNEED, MADV_COLD, MADV_PAGEOUT, MADV_HUGEPAGE, and MADV_NOHUGEPAGE. Actions of this type could be used to, for example, explicitly force out a region that sees little use or to request that huge pages be used for hot regions. Comments within the patch set suggest that mlock is also envisioned as an action, but that is not currently implemented.

A mechanism like this has clear value when it comes to helping developers tune the memory-management subsystem for their workloads. It raises an interesting question, though: given that the kernel can be made to tune itself for better memory-management results, why isn't this capability a part of the memory-management subsystem itself? Bolting it on as a separate module might be useful for memory-management developers, who are likely interested in trying out various ideas. But one might well argue that production systems should Just Work without the need for this sort of manual tweaking, even if the tweaking is supported by a capable monitoring system. While DAMON looks like a useful tool now, users may be forgiven for hoping that it makes itself obsolete over time.

Index entries for this article
Kernel	Memory management/DAMON
Kernel	Releases/5.15

Memory-management optimization with DAMON

Posted Feb 20, 2020 15:57 UTC (Thu) by sjpark (subscriber, #87716) [Link]

Appreciate this great introduction of my patchsets, Jon!

I also made several web pages showing DAMON outputs in more intuitive (visualized) way: https://lore.kernel.org/linux-mm/20200220081710.15211-1-s...
You can show how the data access pattern monitored by DAMON seems like there. Nonetheless, the pages are showing only visualized data access patterns and related analysis results. More reports including performance test results will be also available soon at https://damonitor.github.io/reports/latest/. Stay tuned!

Also, I would like to give my answer to the last question: "given that the kernel can be made to tune itself for better memory-management results, why isn't this capability a part of the memory-management subsystem itself?"

Actually, DAMON provides two interfaces. The interfaces using debugfs and the tracepoints are for privileged userspace programs, profilers, and people. Beside of the interface, DAMON also provides a programmable interface for kernel code. Using this programmable interface, the memory-management subsystem can use DAMON. Indeed, making the system 'Just Work' in optimal way using DAMON is the ultimate goal of this project.

That said, I believe the debugfs interfaces could still be an useful and easy-to-control knob for environments having unique characteristics. Of course, third party kernel modules using the programmable interface for complex schemes which cannot described with the simple format ('min-size max-size min-acc max-acc min-age max-age action') for specific environments are also imaginable.

Memory-management optimization with DAMON

Posted Feb 21, 2020 0:16 UTC (Fri) by nickodell (subscriber, #125165) [Link] (3 responses)

>Unfortunately, it turns out that predictions are hard, especially when they are about future events.

Aren't all predictions about future events?

Memory-management optimization with DAMON

Posted Feb 21, 2020 1:08 UTC (Fri) by xanni (subscriber, #361) [Link]

thatsthejoke.gif

But also, when testing scientific hypotheses it's common to make predictions about current or past events in order to determine the accuracy of the predictions. This is how models are created.

Memory-management optimization with DAMON

Posted Feb 21, 2020 2:45 UTC (Fri) by gus3 (guest, #61103) [Link]

You're conflating "distant" future with "immediate" future.

I can predict that smacking your thumb with a hammer will cause pain. That is "immediate."

But I cannot predict that you *will* smack your thumb with a hammer, anytime today. That is "future."

The immediate future is usually knowable. The distant future, not so much.

Memory-management optimization with DAMON

Posted Feb 21, 2020 7:28 UTC (Fri) by josh (subscriber, #17465) [Link]

https://quoteinvestigator.com/2013/10/20/no-predict/

Memory-management optimization with DAMON

Posted Feb 27, 2020 11:31 UTC (Thu) by Karellen (subscriber, #67644) [Link] (1 responses)

we identified frequently accessed memory regions in each workload based on the DAMON results and protected them with mlock() system calls. The optimized versions consistently show speedup (2.55x in best case, 1.65x in average) under memory pressure.

If feel like "under memory pressure" should be emphasised here. Yes, if you're low on memory and the system is needing to swap, then preventing some regions of memory from being swapped could certainly improve the performance... of that particular application.

However, if you're low on memory and the system is needing to swap, then preventing some regions of memory from being swapped is probably going to have an adverse effect on all the other applications on the system.

That kind of speedup certainly justifies spending some time looking at a process's memory patterns. It would be even nicer, though, if the kernel could do that work itself — that is what a memory-management subsystem is supposed to be for, after all.

I feel like the memory management subsystem should be for making the system as fair as possible?

Memory-management optimization with DAMON

Posted Mar 2, 2020 11:52 UTC (Mon) by sjpark (subscriber, #87716) [Link]

For the test, I ran the application in a seperated cgroup. Thus, the speedup was not came from such sacrifices of other processes. You can show the detailed setup of the tests in the related paper: https://dl.acm.org/doi/10.1145/3366626.3368125

Memory-management optimization with DAMON

Posted Jul 23, 2021 18:56 UTC (Fri) by scientes (guest, #83068) [Link] (1 responses)

> To a great extent, memory management is based on making predictions: which pages of memory will a given process need in the near future?

That is what *caching* is about. We have random access memory for a reason, and even a MMU and the overhead it brings (easily turning a single access into 5 accesses) is unnecessary (and does not mean insecurity, although that is the general assumption). You've locked yourself into an overcommit world, and then are trying to remove all the dung that comes with it, but I really think that is the wrong approach.

MMUs were designed the way they are because it was a virtual machine approach to DOS (and then history repeats itsself with what we now call virtual machines), but it isn't the most efficient method of adding multi-process memory access security to RAM, with the power-hungry TLBs.

Memory-management optimization with DAMON

Posted Sep 7, 2022 9:46 UTC (Wed) by gerlash (guest, #160715) [Link]

Do you have some pointers to an alternative?