Re: [PATCH] SUNRPC: use congestion_wait() in svc_alloc_args()

Chuck Lever III <chuck.lever@xxxxxxxxxx> · Tue, 7 Sep 2021 14:53:48 +0000

> On Sep 6, 2021, at 8:41 PM, NeilBrown <neilb@xxxxxxx> wrote:
> 
> When does a single-page GFP_KERNEL allocation fail?  Ever?
> 
> I know that if I add __GFP_NOFAIL then it won't fail and that is
> preferred to looping.
> I know that if I add __GFP_RETRY_MAYFAIL (or others) then it might
> fail.
> But what are the semantics for a plain GFP_KERNEL?
> 
> I recall a suggestion once that it would only fail if the process was
> being killed by the oom killer.  That seems reasonable and would suggest
> that retrying is really bad.  Is that true?
> 
> For svc_alloc_args(), it might be better to fail and have the calling
> server thread exit.  This would need to be combined with dynamic
> thread-count management so that when a request arrived, a new thread
> might be started.

I don't immediately see a benefit to killing server threads
during periods of memory exhaustion, but sometimes I lack
imagination.

> So maybe we really don't want reclaim_progress_wait(), and all current
> callers of congestion_wait() which are just waiting for allocation to
> succeed should be either changed to use __GFP_NOFAIL, or to handle
> failure.

I had completely forgotten about __GFP_NOFAIL. That seems like the
preferred answer, as it avoids an arbitrary time-based wait for
a memory resource. (And maybe svc_rqst_alloc() should also get
the NOFAIL treatment?).

The question in my mind is how the new alloc_pages_bulk() will
behave when passed NOFAIL. I would expect that NOFAIL would simply
guarantee that at least one page is allocated on every call.
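
To make that expectation concrete, here is a minimal userspace sketch,
not kernel code. It assumes a bulk allocator that, under a NOFAIL-like
rule, populates at least one empty slot per call. The names
model_alloc_bulk(), model_fill_args(), and the "budget" parameter are
hypothetical and exist only for this illustration; malloc() stands in
for the page allocator, and nothing here describes how
alloc_pages_bulk() actually treats __GFP_NOFAIL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MODEL_PAGE_SIZE 4096

/*
 * Hypothetical model of a bulk allocator. It fills empty slots in
 * pages[0..want-1], using at most `budget` allocations per call to
 * mimic memory pressure. Assumed NOFAIL-like rule: even with a budget
 * of zero, one empty slot is still filled, so every call makes progress.
 * Returns the number of populated slots in the array.
 */
static size_t model_alloc_bulk(size_t want, void **pages, size_t budget)
{
    size_t populated = 0;
    int made_progress = 0;

    assert(pages != NULL);

    for (size_t i = 0; i < want; i++) {
        if (pages[i] == NULL && (budget > 0 || !made_progress)) {
            pages[i] = malloc(MODEL_PAGE_SIZE);
            if (pages[i] == NULL)
                abort();    /* the model has no real reclaim path */
            if (budget > 0)
                budget--;
            made_progress = 1;
        }
        if (pages[i] != NULL)
            populated++;
    }
    return populated;
}

/*
 * Caller loop in the style under discussion: retry until the array is
 * full. Because each call adds at least one page, the loop ends after
 * at most `want` calls and needs no time-based wait between retries.
 * Returns the number of calls made.
 */
static size_t model_fill_args(size_t want, void **pages, size_t budget)
{
    size_t calls = 1;

    while (model_alloc_bulk(want, pages, budget) < want)
        calls++;
    return calls;
}
```

In this model, filling a four-slot array with a budget of zero takes
four calls, one page per call, while a budget of four fills it in a
single call. If the real allocator could return with no new page, the
caller would still need either a wait or a failure path.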

--
Chuck Lever