On Mon, 2019-04-08 at 12:41 -0400, J. Bruce Fields wrote:
> On Mon, Apr 08, 2019 at 03:20:48PM +0000, Trond Myklebust wrote:
> > On Fri, 2019-04-05 at 08:54 -0700, Trond Myklebust wrote:
> > > If there are multiple callbacks queued, waiting for the callback
> > > slot when the callback gets shut down, then they all currently
> > > end up acting as if they hold the slot, and call
> > > nfsd4_cb_sequence_done() resulting in interesting side-effects.
> > >
> > > In addition, the 'retry_nowait' path in nfsd4_cb_sequence_done()
> > > causes a loop back to nfsd4_cb_prepare() without first freeing
> > > the slot, which causes a deadlock when nfsd41_cb_get_slot() gets
> > > called a second time.
> > >
> > > This patch therefore adds a boolean to track whether or not the
> > > callback did pick up the slot, so that it can do the right thing
> > > in these 2 cases.
> > >
> > > Signed-off-by: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx>
> > > ---
> > > v2: try to restart the callback if we hit
> > > nfsd4_cb_sequence_done() without a slot.
> > >
> >
> > Hi Bruce,
> >
> > Should this patch perhaps be considered for stable? The callback
> > slot leak is permanent (or at least for the lifetime of the
> > nfs4_client).
>
> Makes sense to me; I'll queue it up for 5.1 and stable.
>
> What were the original symptoms that prompted this?
>

We were using the NFSv4.2 client for doing CLONE against a knfsd
server, and eventually ended up with a bunch of CB_RECALL requests that
were seen hanging forever on the server.

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx