Re: [PATCH v3 0/7] mm: workingset reporting

David Rientjes <rientjes@xxxxxxxxxx> · Thu, 15 Aug 2024 20:14:44 -0700 (PDT)

On Tue, 13 Aug 2024, Andrew Morton wrote:

> On Tue, 13 Aug 2024 09:56:11 -0700 Yuanchu Xie <yuanchu@xxxxxxxxxx> wrote:
> 
> > This patch series provides workingset reporting of user pages in
> > lruvecs, of which coldness can be tracked by accessed bits and fd
> > references.
> 
> Very little reviewer interest.  I wonder why.  Will Google be the only
> organization which finds this useful?
> 

Although also from Google, I'm optimistic that others will find this very 
useful.  It's implemented in a way that is intended to be generally useful 
for multiple use cases, including user defined policy for proactive 
reclaim.  The cited sample userspace implementation is intended to 
demonstrate how this insight can be put into practice.

Insight into the working set of applications, particularly on multi-tenant
systems, has yielded significant memory savings for Google over the past
decade.  The introduction of MGLRU into the upstream kernel allows this
information to be derived much more efficiently, as presented here, which
should make upstreaming of this insight much more palatable.

This insight into working set will only become more critical going forward 
with memory tiered systems.

Nothing here is specific to Google; in fact, we apply the insight into 
working set in very different ways across our fleets.

> > Benchmarks
> > ==========
> > Ghait Ouled Amar Ben Cheikh has implemented a simple "reclaim everything
> > colder than 10 seconds every 40 seconds" policy and ran Linux compile
> > and redis from the phoronix test suite. The results are in his repo:
> > https://github.com/miloudi98/WMO
> 
> I'd suggest at least summarizing these results here in the [0/N].  The
> Linux kernel will probably outlive that URL!
> 

Fully agreed that this would be useful for including in the cover letter.  

The results showing the impact of proactive reclaim using insight into
working set are impressive for multi-tenant systems.  Having very
comparable performance for kernbench with a fraction of the memory usage
shows the potential of proactive reclaim, without any dependency on
direct reclaim or throttling of the application itself.

This is one of several benchmarks that we are running and we'll be 
expanding upon this with cotenancy, user defined latency sensitivity per
job, extensions for insight into memory re-access, and in-guest use cases.