On Tue, Sep 24, 2024 at 10:19 AM Minchan Kim <minchan@xxxxxxxxxx> wrote:
>
> On Fri, Sep 20, 2024 at 01:23:57PM +1200, Barry Song wrote:
> > On Wed, Sep 18, 2024 at 12:02 AM David Hildenbrand <david@xxxxxxxxxx> wrote:
> > >
> > > On 14.09.24 08:37, Barry Song wrote:
> > > > From: Barry Song <v-songbaohua@xxxxxxxx>
> > > >
> > > > This follows up on the discussion regarding Gaoxu's work[1]. It's
> > > > unclear if there's still interest in implementing a separate LRU
> > > > list for lazyfree folios, but I decided to explore it out of
> > > > curiosity.
> > > >
> > > > According to Lokesh, MADV_FREE'd anon folios are expected to be
> > > > released earlier than file folios. One option, as implemented
> > > > by Gao Xu, is to place lazyfree anon folios at the tail of the
> > > > file's `min_seq` generation. However, this approach results in
> > > > lazyfree folios being released in a LIFO manner, which conflicts
> > > > with LRU behavior, as noted by Michal.
> > > >
> > > > To address this, this patch proposes maintaining a separate list
> > > > for lazyfree anon folios while keeping them classified under the
> > > > "file" LRU type to minimize code changes. These lazyfree anon
> > > > folios will still be counted as file folios and share the same
> > > > generation with regular files. In the eviction path, the lazyfree
> > > > list will be prioritized for scanning before the actual file
> > > > LRU list.
> > > >
> > >
> > > What's the downside of another LRU list? Do we have any experience on that?
> >
> > Essentially, the goal is to address the downsides of using a single LRU
> > list for files and lazyfree anonymous pages - significantly more file
> > refaults.
> >
> > I'm not entirely clear on the downsides of having an additional LRU
> > list. While it does increase complexity, it doesn't seem to be
> > significant.
>
> It's not catastrophic[1]. I prefer the idea of an additional LRU
> because it offers flexibility for various potential use cases[2].
>
> Orthogonal topic (but it may be of interest to someone):
>
> My main interest in a new LRU list is to enable the system to maintain a
> quickly reclaimable memory pool and expose its size to the admin with
> a knob to decide how large a pool they want.
>
> This pool would consist of clean, unmapped pages from both the page cache
> and/or the swap cache. This would allow the system to reclaim memory quickly
> when free memory is low, at the cost of minor fault overhead.

My current implementation only handles the MADV_FREE anonymous case. If
those folios are placed in a single LRU, they can be reclaimed very
quickly - simply discarded without needing to be swapped out.

I've been thinking about the issue of unmapped pagecache recently. Unmapped
pagecache can be reclaimed much faster than mapped pagecache, especially
when the latter has a high mapcount and incurs significant rmap costs.
However, many pagecache folios are inherently unmapped (e.g., populated by
syscall read). If they are placed in a single LRU, the challenge is
comparing the age of unmapped pagecache with mapped pagecache. Currently,
with the mglru tier mechanism, frequently accessed unmapped pagecache has a
chance to be placed in a spot where it is harder to reclaim.

Personally, I am quite interested in grouping unmapped pagecache together,
as right now reclamation can look like this:

lru list: unmapped pagecache(A) - mapped pagecache(B) - unmapped
pagecache(C) - mapped pagecache with huge mapcount(D)

A and C can be reclaimed with zero cost, but they have to wait for D and B.
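
To make the MADV_FREE case above concrete, here is a minimal userspace
sketch - nothing from this patch, just the documented madvise(2)/MADV_FREE
interface (Linux >= 4.5) - showing how anon memory becomes lazily freeable
so reclaim can drop it without any swap I/O:

#define _DEFAULT_SOURCE
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

int main(void)
{
	size_t len = 16UL << 20;	/* 16 MiB of anonymous memory */
	char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (buf == MAP_FAILED) {
		perror("mmap");
		return 1;
	}

	memset(buf, 0xaa, len);		/* dirty the anon folios */

	/*
	 * Mark the range lazily freeable: under memory pressure these
	 * folios can simply be discarded instead of being swapped out.
	 * A write before reclaim cancels the free for that page; a read
	 * after reclaim sees zero-filled pages.
	 */
	if (madvise(buf, len, MADV_FREE))
		perror("madvise(MADV_FREE)");

	munmap(buf, len);
	return 0;
}

With the patch described above, the folios backing buf would sit on the
separate lazyfree list and be scanned before the actual file LRU.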
But the question is: if we make two lists,

list1: A - C
list2: B - D

how can we ensure that A and C won't experience many refaults, even though
reclaiming them would be cost-free? Or that B and D might actually be colder
than A and C? If this isn't an issue, I'd be very interested in implementing
it.

Any thoughts?

> [1] https://lore.kernel.org/linux-kernel//1448006568-16031-15-git-send-email-minchan@xxxxxxxxxx/
> [2] https://lkml.org/lkml/2012/6/19/24

Thanks
Barry