Re: [PATCH] nfsd: add scheduling point in nfsd_file_gc()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, 06 Jan 2025, Chuck Lever wrote:
> On 1/5/25 6:11 PM, NeilBrown wrote:
> > Under a high NFSv3 load with lots of different file being accessed The
> > list_lru of garbage-collectable files can become quite long.
> > 
> > Asking lisT_lru_scan() to scan the whole list can result in a long
> > period during which a spinlock is held and no scheduling is possible.
> > This is impolite.
> > 
> > So only ask list_lru_scan() to scan 1024 entries at a time, and repeat
> > if necessary - calling cond_resched() each time.
> > 
> > Signed-off-by: NeilBrown <neilb@xxxxxxx>
> > ---
> >   fs/nfsd/filecache.c | 17 ++++++++++++-----
> >   1 file changed, 12 insertions(+), 5 deletions(-)
> > 
> > diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
> > index a1cdba42c4fa..e99a86798e86 100644
> > --- a/fs/nfsd/filecache.c
> > +++ b/fs/nfsd/filecache.c
> > @@ -543,11 +543,18 @@ nfsd_file_gc(void)
> >   {
> >   	LIST_HEAD(dispose);
> >   	unsigned long ret;
> > -
> > -	ret = list_lru_walk(&nfsd_file_lru, nfsd_file_lru_cb,
> > -			    &dispose, list_lru_count(&nfsd_file_lru));
> > -	trace_nfsd_file_gc_removed(ret, list_lru_count(&nfsd_file_lru));
> > -	nfsd_file_dispose_list_delayed(&dispose);
> > +	unsigned long cnt = list_lru_count(&nfsd_file_lru);
> > +
> > +	while (cnt > 0) {
> 
> Since @cnt is unsigned, it should be safe to use "while (cnt) {"

"while (cnt > 0) {" is equally safe and for me it emphasises that it
is a decreasing counter, not a Bool.  But different people have
different tastes.

> 
> (I might use "total" and "remaining" here -- "cnt", when said aloud,
> leaves me snickering).

"remaining" would be better than "cnt".

> 
> 
> > +		unsigned long num_to_scan = min(cnt, 1024UL);
> 
> I see long delays with fewer than 1024 items on the list. I might
> drop this number by one or two orders of magnitude. And make it a
> symbolic constant.

In that case I seriously wonder if this is where the delays are coming
from.

nfsd_file_dispose_list_delayed() does take and drop a spinlock
repeatedly (though it may not always be the same lock) and call
svc_wake_up() repeatedly - although the head of the queue might already
be woken.  We could optimise that to detect runs with the same nn and
only take the lock once, and only wake_up once.

> 
> There's another naked integer (8) in nfsd_file_net_dispose() -- how does
> that relate to this new cap? Should that also be a symbolic constant?

I don't think they relate.
The trade-off with "8" is:
  a bigger number might block an nfsd thread for longer,
    forcing serialising when the work can usefully be done in parallel.
  a smaller number might needlessly wake lots of threads
    to share out a tiny amount of work.

The 1024 is simply about "don't hold a spinlock for too long".

> 
> 
> > +		ret = list_lru_walk(&nfsd_file_lru, nfsd_file_lru_cb,
> > +				    &dispose, num_to_scan);
> > +		trace_nfsd_file_gc_removed(ret, list_lru_count(&nfsd_file_lru));
> > +		nfsd_file_dispose_list_delayed(&dispose);
> 
> I need to go back and review the function traces to see where the
> delays add up -- to make sure rescheduling here, rather than at some
> other point, is appropriate. It probably is, but my memory fails me
> these days.

I would like to see those function traces too.

> 
> 
> > +		cnt -= num_to_scan;
> > +		if (cnt)
> > +			cond_resched();
> 
> Another approach might be to poke the laundrette again and simply
> exit here, but I don't feel strongly about that.

That wouldn't work without storing the 'remaining' count somewhere.
With the current design we need to scan the whole list every 2 seconds.

Thanks,
NeilBrown


> 
> 
> > +	}
> >   }
> >   
> >   static void
> 
> 
> -- 
> Chuck Lever
> 





[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux