Re: [PATCH 1/1] NFSD: cancel CB_RECALL_ANY call when nfs4_client is about to be destroyed

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, 2024-04-01 at 09:34 -0400, Chuck Lever wrote:
> On Mon, Apr 01, 2024 at 08:49:49AM -0400, Jeff Layton wrote:
> > On Sat, 2024-03-30 at 16:30 -0700, Dai Ngo wrote:
> > > On 3/30/24 11:28 AM, Chuck Lever wrote:
> > > > On Sat, Mar 30, 2024 at 10:46:08AM -0700, Dai Ngo wrote:
> > > > > On 3/29/24 4:42 PM, Chuck Lever wrote:
> > > > > > On Fri, Mar 29, 2024 at 10:57:22AM -0700, Dai Ngo wrote:
> > > > > > > On 3/29/24 7:55 AM, Chuck Lever wrote:
> > > > > > It could be straightforward, however, to move the callback_wq into
> > > > > > struct nfs4_client so that each client can have its own workqueue.
> > > > > > Then we can take some time and design something less brittle and
> > > > > > more scalable (and maybe come up with some test infrastructure so
> > > > > > this stuff doesn't break as often).
> > > > > IMHO I don't see why the callback workqueue has to be different
> > > > > from the laundry_wq or nfsd_filecache_wq used by nfsd.
> > > > You mean, you don't see why callback_wq has to be ordered, while
> > > > the others are not so constrained? Quite possibly it does not have
> > > > to be ordered.
> > > 
> > > Yes, I looked at the all the nfsd4_callback_ops on nfsd and they
> > > seems to take into account of concurrency and use locks appropriately.
> > > For each type of work I don't see why one work has to wait for
> > > the previous work to complete before proceed.
> > > 
> > > > But we might have lost the bit of history that explains why, so
> > > > let's be cautious about making broad changes here until we have a
> > > > good operational understanding of the code and some robust test
> > > > cases to check any changes we make.
> > > 
> > > Understand, you make the call.
> > 
> > commit 88382036674770173128417e4c09e9e549f82d54
> > Author: J. Bruce Fields <bfields@xxxxxxxxxxxx>
> > Date:   Mon Nov 14 11:13:43 2016 -0500
> > 
> >     nfsd: update workqueue creation
> >     
> >     No real change in functionality, but the old interface seems to be
> >     deprecated.
> >     
> >     We don't actually care about ordering necessarily, but we do depend on
> >     running at most one work item at a time: nfsd4_process_cb_update()
> >     assumes that no other thread is running it, and that no new callbacks
> >     are starting while it's running.
> >     
> >     Reviewed-by: Jeff Layton <jlayton@xxxxxxxxxx>
> >     Signed-off-by: J. Bruce Fields <bfields@xxxxxxxxxx>
> > 
> > 
> > ...so it may be as simple as just fixing up nfsd4_process_cb_update().
> > Allowing parallel recalls would certainly be a good thing.
> 
> Thanks for the research. I was about to do the same.
> 
> I think we do allow parallel recalls -- IIUC, callback_wq
> single-threads only the dispatch of RPC calls, not their
> processing. Note the use of rpc_call_async().
> 
> So nfsd4_process_cb_update() is protecting modifications of
> cl_cb_client and the backchannel transport. We might wrap that in
> a mutex, for example. But I don't see strong evidence (yet) that
> this design is a bottleneck when it is working properly.
> 
> However, if for some reason, a work item sleeps, that would
> block forward progress of the work queue, and would be Bad (tm).
> 

Right. That's my main concern with the current single-threaded
workqueue. A stuck or sleeping job will block all callbacks, even to
clients that aren't related to the stuck one. The RPC layer is generally
good at not blocking when we ask it not to, but there are places where
the task can sleep (memory allocation comes to mind).

Also, the single threaded workqueue is just an artificial bottleneck. If
the server has a lot of callbacks to send, there is no need to serialize
that activity.

> 
> > That said, a workqueue per client would be a great place to start. I
> > don't see any reason to serialize callbacks to different clients.
> 
> I volunteer to take care of that for v6.10.
> 
> 

Excellent!
-- 
Jeff Layton <jlayton@xxxxxxxxxx>





[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux