Re: [PATCH v2 2/7] NFSD: Re-organize nfsd_file_gc_worker()

"NeilBrown" <neilb@xxxxxxx> · Wed, 19 Feb 2025 12:20:15 +1100

On Wed, 19 Feb 2025, Dave Chinner wrote:
> On Tue, Feb 18, 2025 at 10:39:32AM -0500, cel@xxxxxxxxxx wrote:
> > From: Chuck Lever <chuck.lever@xxxxxxxxxx>
> > 
> > Dave opines:
> > 
> > IMO, there is no need to do this unnecessary work on every object
> > that is added to the LRU.  Changing the gc worker to always run
> > every 2s and check if it has work to do like so:
> > 
> >  static void
> >  nfsd_file_gc_worker(struct work_struct *work)
> >  {
> > -	nfsd_file_gc();
> > -	if (list_lru_count(&nfsd_file_lru))
> > -		nfsd_file_schedule_laundrette();
> > +	if (list_lru_count(&nfsd_file_lru))
> > +		nfsd_file_gc();
> > +	nfsd_file_schedule_laundrette();
> >  }
> > 
> > means that nfsd_file_gc() will be run the same way and have the same
> > behaviour as the current code. When the system it idle, it does a
> > list_lru_count() check every 2 seconds and goes back to sleep.
> > That's going to be pretty much unnoticable on most machines that run
> > NFS servers.
> > 
> > Suggested-by: Dave Chinner <david@xxxxxxxxxxxxx>
> > Signed-off-by: Chuck Lever <chuck.lever@xxxxxxxxxx>
> > ---
> >  fs/nfsd/filecache.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
> > index 909b5bc72bd3..2933cba1e5f4 100644
> > --- a/fs/nfsd/filecache.c
> > +++ b/fs/nfsd/filecache.c
> > @@ -549,9 +549,9 @@ nfsd_file_gc(void)
> >  static void
> >  nfsd_file_gc_worker(struct work_struct *work)
> >  {
> > -	nfsd_file_gc();
> > +	nfsd_file_schedule_laundrette();
> >  	if (list_lru_count(&nfsd_file_lru))
> > -		nfsd_file_schedule_laundrette();
> > +		nfsd_file_gc();
> >  }
> 
> IMO, the scheduling of new work is the wrong way around. It should
> be done on completion of gc work, not before gc work is started.
> 
> i.e. If nfsd_file_gc() is overly delayed (because load, rt preempt,
> etc), then a new gc worker will be started in 2s regardless of
> whether the currently running gc worker has completed or not.
> 
> Worse case, there's a spinlock hang bug in nfsd_file_gc(). This code
> will end up with N worker threads all spinning up in nfsd_file_gc()
> chewing up all the CPU in the system, not making any progress....
> If we schedule new work after completion of this work, then gc might
> hang but it won't slowly drag the entire system down with it.

While I agree that the enqueue is best done later rather than earlier, I
think your worst-case is over-stated.
queue_delayed_work() is a no-op if WORK_STRUCT_PENDING_BIT is still set.
A given work_struct can only be running once.
If the timer fires while nfsd_file_gc() is still running,
nfsd_filecache_laundrette will be queued to start immediately that the
currently running instance completes.  So the worst cases is that
there will always be one instance running.

Thanks,
NeilBrown