On Wed, Sep 22, 2021 at 03:16:34PM -0700, dai.ngo@xxxxxxxxxx wrote: > > On 9/22/21 2:14 PM, J. Bruce Fields wrote: > >On Thu, Sep 16, 2021 at 02:22:11PM -0400, Dai Ngo wrote: > >>+/* > >>+ * If the conflict happens due to a NFSv4 request then check for > >>+ * courtesy client and set rq_conflict_client so that upper layer > >>+ * can destroy the conflict client and retry the call. > >>+ */ > >>+static bool > >>+nfsd_check_courtesy_client(struct nfs4_delegation *dp) > >>+{ > >>+ struct svc_rqst *rqst; > >>+ struct nfs4_client *clp = dp->dl_recall.cb_clp; > >>+ struct nfsd_net *nn = net_generic(clp->net, nfsd_net_id); > >>+ bool ret = false; > >>+ > >>+ if (!i_am_nfsd()) { > >>+ if (test_bit(NFSD4_COURTESY_CLIENT, &clp->cl_flags)) { > >>+ set_bit(NFSD4_DESTROY_COURTESY_CLIENT, &clp->cl_flags); > >>+ mod_delayed_work(laundry_wq, &nn->laundromat_work, 0); > >>+ return true; > >>+ } > >>+ return false; > >>+ } > >>+ rqst = kthread_data(current); > >>+ if (rqst->rq_prog != NFS_PROGRAM || rqst->rq_vers < 4) > >>+ return false; > >>+ rqst->rq_conflict_client = NULL; > >>+ > >>+ spin_lock(&nn->client_lock); > >>+ if (test_bit(NFSD4_COURTESY_CLIENT, &clp->cl_flags) && > >>+ !mark_client_expired_locked(clp)) { > >>+ rqst->rq_conflict_client = clp; > >>+ ret = true; > >>+ } > >>+ spin_unlock(&nn->client_lock); > >Check whether this is safe; I think the flc_lock may be taken inside of > >this lock elsewhere, resulting in a potential deadlock? > > > >rqst doesn't need any locking as it's only being used by this thread, so > >it's the client expiration stuff that's the problem, I guess. > > mark_client_expired_locked needs to acquire cl_lock. I think the lock > ordering is ok, client_lock -> cl_lock. nfsd4_exchange_id uses this > lock ordering. It's flc_lock (see locks.c) that I'm worried about. I've got a lockdep warning here, taking a closer look.... nfsd4_release_lockowner takes clp->cl_lock and then fcl_lock. Here we're taking fcl_lock and then client_lock. As you say, elsewhere client_lock is taken and then cl_lock. So that's the loop, I think. --b.