Re: [LSF/MM/BPF TOPIC] Physical LRU scanning feasibility

Yosry Ahmed <yosryahmed@xxxxxxxxxx> · Wed, 8 Jan 2025 14:33:01 -0800

On Wed, Jan 8, 2025 at 1:46 PM Lorenzo Stoakes
<lorenzo.stoakes@xxxxxxxxxx> wrote:
>
> Hi all,
>
> Not too long ago I took some time to investigate the possibility of
> scanning physical memory directly by traversing the memory map directly
> rather than the LRU linked list.
>
> This was inspired by a post from Matthew [0] wherein he demonstrated just
> how significant the difference is between traversing arrays of contiguous
> data on a modern system vs. the almost worst-case scenario of traversing a
> linked-list.
>
> I tested how this might look by implementing code which simply traverses
> and filters the memory map for LRU pages, simplifying as much as possible.
>
> However no matter which machine (ranging from 16 GB - 192 GB) or whether
> virtualised or real hardware, I found unfortunately disappointing results -
> the act of having to scan such a large range of memory resulted in
> performance significantly less than a typical LRU scan at low memory
> utilisation and performance at best matching LRU scanning at high memory
> utilisation (simulating higher memory pressure).
>
> There are a number of factors at play here, and perhaps the shrinkage of
> struct page (allowing for denser placement in cache lines), or an improved
> algorithm might lead to more promising results.
>
> Having discussed this with Matthew, he suggested I put forward a proposal
> to discuss this area in order that we can learn from this should it appear
> this approach is unworkable or perhaps determine whether there might be
> something to this that we might still salvage.
>
> I intend to do some more research and generate some more specific numbers
> (feel free to give feedback here) before LSF so we can have something more
> specific to talk about.
>
> I always envisioned this approach being somehow integrated with MGLRU and I
> wonder if some hybrid means of integrating this approach with the MGLRU one
> might make sense, which could also be another area of discussion.

When I read this proposal the first thing that came to mind was memcg
reclaim. While it seems to me that it is already inefficient to scan
all physical memory looking for a possible reclaim candidate, it seems
even more inefficient to try to find a possible reclaim candidate
within the needed memcg. We'll also probably do a lot of repeated
scanning as we iterate memcgs. The per-memcg per-node LRUs save us
from this.

Do you have an idea about handling memcg reclaim efficiently?