Re: [PATCH] mm, percpu: do not consider sleepable allocations atomic

Michal Hocko <mhocko@xxxxxxxx> · Fri, 14 Feb 2025 16:52:42 +0100

On Wed 12-02-25 13:39:31, Dennis Zhou wrote:
> Hello,
> 
> On Wed, Feb 12, 2025 at 11:30:08AM -1000, Tejun Heo wrote:
> > Hello,
> > 
> > On Wed, Feb 12, 2025 at 09:53:20PM +0100, Michal Hocko wrote:
> > ...
> > > > Hmm... you'd a better judge on whether that'd be okay or not but it does
> > > > bother me that we might be increasing the chance of allocation failures for
> > > > GFP_KERNEL users at least under memory pressure.
> > > 
> > > Nope, this will not change the allocation failure mode. Reclaim
> > > constrains do not change the failure mode they just change how much the
> > > allocation might struggle to reclaim to succeed. 
> > >
> > > My undocumented assumption (another dept on my end) is that pcp
> > > allocations are no hot paths. So the worst case is that GFP_KERNEL
> > > pcp_allocation could have been satisfied _easier_ (i.e. faster) because
> > > it could have reclaimed fs/io caches and now it needs to rely on kswapd
> > > to do that on memory tight situations. On the other hand we have a
> > > situation when NOIO/FS allocations fail prematurely so there is
> > > certainly some pros and cons.
> > 
> > I'm having a hard time following. Are you saying that it won't increase the
> > likelihood of allocation failures even under memory pressure but that it
> > might just make allocations take longer to succeed?
> > 
> > NOFS/IO prevents allocation attempt from entering fs/io reclaim paths,
> > right? It would still trigger kswapd for reclaim but can the allocation
> > attempt wait for that to finish? If so, wouldn't that constitute a
> > dependency cycle all the same?
> > 
> > All in all, percpu allocations taking longer under memory pressure is fine.
> > Becoming more prone to allocation failures, especially for GFP_KERNEL
> > callers, probably isn't great.
> > 
> 
> Wait, I think I'm interpreting this change differently. This is
> preventing the worker from allocating backing pages via GFP_KERNEL. It
> isn't preventing an allocation via alloc_percpu() from being GFP_KERNEL
> and providing those flags down to the backing page code. alloc_percpu()
> for GFP_KERNEL allocations will populate the pages before returning.

Correct.

> I'm reading this as potentially making atomic percpu allocations fail as
> we might be low on backing pages. This change makes the worker now need
> to wait for kswapd to give it pages. Consequently, if there are a lot of
> allocations coming in when it's low, we might burn a bit of cpu from the
> worker now.

Yes, this is potential side effect. On the other hand NOFS/NOIO requests
wouldn't be considered atomic anymore and they wouldn't fail that
easily. Maybe that is an odd case not worth the additional worker
overhead. As I've said I am not familiar with the pcp internals to know
how often the worker is really required

-- 
Michal Hocko
SUSE Labs