Re: [Lsf-pc] [LSF/MM/BPF TOPIC] Reclamation interactions with RCU

Kent Overstreet <kent.overstreet@xxxxxxxxx> · Thu, 29 Feb 2024 23:15:10 -0500

On Fri, Mar 01, 2024 at 11:08:52AM +0700, James Bottomley wrote:
> On Thu, 2024-02-29 at 22:52 -0500, Kent Overstreet wrote:
> > On Fri, Mar 01, 2024 at 10:33:59AM +0700, James Bottomley wrote:
> > > On Thu, 2024-02-29 at 22:09 -0500, Kent Overstreet wrote:
> > > > Or maybe you just want the syscall to return an error instead of
> > > > blocking for an unbounded amount of time if userspace asks for
> > > > something silly.
> > > 
> > > Warn on allocation above a certain size without MAY_FAIL would seem
> > > to cover all those cases.  If there is a case for requiring instant
> > > allocation, you always have GFP_ATOMIC, and, I suppose, we could
> > > even do a bounded reclaim allocation where it tries for a certain
> > > time then fails.
> > 
> > Then you're baking in this weird constant into all your algorithms
> > that doesn't scale as machine memory sizes and working set sizes
> > increase.
> > 
> > > > Honestly, relying on the OOM killer and saying that because of
> > > > that you don't have to write and test your error paths is a lazy
> > > > cop-out.
> > > 
> > > OOM Killer is the most extreme outcome.  Usually reclaim (hugely
> > > simplified) dumps clean cache first and tries the shrinkers then
> > > tries to write out dirty cache.  Only after that hasn't found
> > > anything after a few iterations will the OOM killer get activated.
> > 
> > All your caches get dumped, the machine grinds to a halt, and then a
> > random process gets killed instead of simply _failing the
> > allocation_.
> 
> Ignoring the fact-free invective below, I think what you're asking for
> is strict overcommit.  There's a tunable for that:
> 
> https://www.kernel.org/doc/Documentation/vm/overcommit-accounting
> 
> However, see the Gotchas section for why we can't turn it on globally,
> but it is available to you if you know what you're doing.

James, I already explained all this.