Re: [RFC-PATCH 1/2] mm: Add __GFP_NO_LOCKS flag

"Paul E. McKenney" <paulmck@xxxxxxxxxx> · Tue, 18 Aug 2020 09:18:46 -0700

On Tue, Aug 18, 2020 at 05:02:32PM +0200, Michal Hocko wrote:
> On Tue 18-08-20 06:53:27, Paul E. McKenney wrote:
> > On Tue, Aug 18, 2020 at 09:43:44AM +0200, Michal Hocko wrote:
> > > On Mon 17-08-20 15:28:03, Paul E. McKenney wrote:
> > > > On Mon, Aug 17, 2020 at 10:28:49AM +0200, Michal Hocko wrote:
> > > > > On Mon 17-08-20 00:56:55, Uladzislau Rezki wrote:
> > > > 
> > > > [ . . . ]
> > > > 
> > > > > > wget ftp://vps418301.ovh.net/incoming/1000000_kmalloc_kfree_rcu_proc_percpu_pagelist_fractio_is_8.png
> > > > > 
> > > > > 1/8 of the memory in pcp lists is quite large and likely not something
> > > > > used very often.
> > > > > 
> > > > > Both these numbers just make me think that a dedicated pool of page
> > > > > pre-allocated for RCU specifically might be a better solution. I still
> > > > > haven't read through that branch of the email thread though so there
> > > > > might be some pretty convincing argments to not do that.
> > > > 
> > > > To avoid the problematic corner cases, we would need way more dedicated
> > > > memory than is reasonable, as in well over one hundred pages per CPU.
> > > > Sure, we could choose a smaller number, but then we are failing to defend
> > > > against flooding, even on systems that have more than enough free memory
> > > > to be able to do so.  It would be better to live within what is available,
> > > > taking the performance/robustness hit only if there isn't enough.
> > > 
> > > Thomas had a good point that it doesn't really make much sense to
> > > optimize for flooders because that just makes them more effective.
> > 
> > The point is not to make the flooders go faster, but rather for the
> > system to be robust in the face of flooders.  Robust as in harder for
> > a flooder to OOM the system.
> 
> Do we see this to be a practical problem? I am really confused because
> the initial argument was revolving around an optimization now you are
> suggesting that this is actually system stability measure. And I fail to
> see how allowing an easy way to deplete pcp caches completely solves
> any of that. Please do realize that if allow that then every user who
> relies on pcp caches will have to take a slow(er) path and that will
> have performance consequences. The pool is a global and a scarce
> resource. That's why I've suggested a dedicated preallocated pool and
> use it instead of draining global pcp caches.

Both the optimization and the robustness are important.

The problem with this thing is that I have to start describing it
somewhere, and I have not yet come up with a description of the whole
thing that isn't TL;DR.

> > And reducing the number of post-grace-period cache misses makes it
> > easier for the callback-invocation-time memory freeing to keep up with
> > the flooder, thus avoiding (or at least delaying) the OOM.
> > 
> > > > My current belief is that we need a combination of (1) either the
> > > > GFP_NOLOCK flag or Peter Zijlstra's patch and
> > > 
> > > I must have missed the patch?
> > 
> > If I am keeping track, this one:
> > 
> > https://lore.kernel.org/lkml/20200814215206.GL3982@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/
> 
> OK, I have certainly noticed that one but didn't react but my response
> would be similar to the dedicated gfp flag. This is less of a hack than
> __GFP_NOLOCK  but it still exposes very internal parts of the allocator
> and I find that a quite problematic from the future maintenance of the
> allocator. The risk of an easy depletion of the pcp pool is also there
> of course.

I had to ask.  ;-)

							Thanx, Paul