Re: [PATCH 3/4] nfsd: replace rp_mutex to avoid deadlock in move_to_close_lru()

On Tue, Mar 05, 2024 at 09:36:29AM +1100, NeilBrown wrote:
> On Tue, 05 Mar 2024, Chuck Lever wrote:
> > On Tue, Mar 05, 2024 at 08:45:45AM +1100, NeilBrown wrote:
> > > On Tue, 05 Mar 2024, Chuck Lever wrote:
> > > > On Mon, Mar 04, 2024 at 03:40:21PM +1100, NeilBrown wrote:
> > > > > move_to_close_lru() waits for sc_count to become zero while holding
> > > > > rp_mutex.  This can deadlock if another thread holds a reference and is
> > > > > waiting for rp_mutex.
> > > > > 
> > > > > By the time we get to move_to_close_lru() the openowner is unhashed and
> > > > > cannot be found any more.  So code waiting for the mutex can safely
> > > > > retry the lookup if move_to_close_lru() has started.
> > > > > 
> > > > > So change rp_mutex to an atomic_t with three states:
> > > > > 
> > > > >  RP_UNLOCKED - state is still hashed, not locked for reply
> > > > >  RP_LOCKED   - state is still hashed, is locked for reply
> > > > >  RP_UNHASHED - state is not hashed, no code can get a lock.
> > > > > 
> > > > > Use wait_var_event() to wait for either a lock, or for the owner to be
> > > > > unhashed.  In the latter case, retry the lookup.
> > > > > 
> > > > > Signed-off-by: NeilBrown <neilb@xxxxxxx>
> > > > > ---
> > > > >  fs/nfsd/nfs4state.c | 38 +++++++++++++++++++++++++++++++-------
> > > > >  fs/nfsd/state.h     |  2 +-
> > > > >  2 files changed, 32 insertions(+), 8 deletions(-)
> > > > > 
> > > > > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> > > > > index 690d0e697320..47e879d5d68a 100644
> > > > > --- a/fs/nfsd/nfs4state.c
> > > > > +++ b/fs/nfsd/nfs4state.c
> > > > > @@ -4430,21 +4430,32 @@ nfsd4_init_leases_net(struct nfsd_net *nn)
> > > > >  	atomic_set(&nn->nfsd_courtesy_clients, 0);
> > > > >  }
> > > > >  
> > > > > +enum rp_lock {
> > > > > +	RP_UNLOCKED,
> > > > > +	RP_LOCKED,
> > > > > +	RP_UNHASHED,
> > > > > +};
> > > > > +
> > > > >  static void init_nfs4_replay(struct nfs4_replay *rp)
> > > > >  {
> > > > >  	rp->rp_status = nfserr_serverfault;
> > > > >  	rp->rp_buflen = 0;
> > > > >  	rp->rp_buf = rp->rp_ibuf;
> > > > > -	mutex_init(&rp->rp_mutex);
> > > > > +	atomic_set(&rp->rp_locked, RP_UNLOCKED);
> > > > >  }
> > > > >  
> > > > > -static void nfsd4_cstate_assign_replay(struct nfsd4_compound_state *cstate,
> > > > > -		struct nfs4_stateowner *so)
> > > > > +static int nfsd4_cstate_assign_replay(struct nfsd4_compound_state *cstate,
> > > > > +				      struct nfs4_stateowner *so)
> > > > >  {
> > > > >  	if (!nfsd4_has_session(cstate)) {
> > > > > -		mutex_lock(&so->so_replay.rp_mutex);
> > > > > +		wait_var_event(&so->so_replay.rp_locked,
> > > > > +			       atomic_cmpxchg(&so->so_replay.rp_locked,
> > > > > +					      RP_UNLOCKED, RP_LOCKED) != RP_LOCKED);
> > > > 
> > > > What reliably prevents this from being a "wait forever"?
> > > 
> > > That same thing that reliably prevented the original mutex_lock from
> > > waiting forever.
> > > It waits until rp_locked is set to RP_UNLOCKED, which is precisely when
> > > we previously called mutex_unlock.  But it *also* aborts the wait if
> > > rp_locked is set to RP_UNHASHED - so it is now more reliable.
> > > 
> > > Does that answer the question?
> > 
> > Hm. I guess then we are no worse off with wait_var_event().
> > 
> > I'm not as familiar with this logic as perhaps I should be. How long
> > does it take for the wake-up to occur, typically?
> 
> wait_var_event() is paired with wake_up_var().
> The wake up happens when wake_up_var() is called, which in this code is
> always immediately after atomic_set() updates the variable.
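For reference, the wake-up side of that pairing looks roughly like the
following (a sketch pieced together from the hunks quoted above and your
description; the corresponding hunks are trimmed from this excerpt, so
the exact placement is my reading, not a quote from the patch):

	/* unlock path -- replaces mutex_unlock(&rp->rp_mutex); the
	 * owner is still hashed, so hand the reply "lock" to the next
	 * waiter blocked in nfsd4_cstate_assign_replay() */
	atomic_set(&so->so_replay.rp_locked, RP_UNLOCKED);
	wake_up_var(&so->so_replay.rp_locked);

	/* unhash path, in move_to_close_lru() -- the openowner can no
	 * longer be found by a lookup, so cancel all waiters: their
	 * wait_var_event() condition becomes true (RP_UNHASHED !=
	 * RP_LOCKED) and they retry the lookup instead of sleeping on
	 * a dying owner */
	atomic_set(&oo->oo_owner.so_replay.rp_locked, RP_UNHASHED);
	wake_up_var(&oo->oo_owner.so_replay.rp_locked);

So a waiter wakes either because it now holds the reply "lock" or
because the owner went away, never because of a timeout.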

I'm trying to ascertain the maximum wall-clock time that the nfsd
thread can sleep here. Is this going to be a possible DoS vector? Can
it impact the server's ability to shut down without hanging?


-- 
Chuck Lever



