Re: [PATCH v2] nfsd: Fix race between FREE_STATEID and LOCK

On Mon, 2016-08-08 at 15:53 -0400, J. Bruce Fields wrote:
> On Mon, Aug 08, 2016 at 12:14:36PM -0400, Chuck Lever wrote:
> > 
> > 
> > > 
> > > On Aug 8, 2016, at 9:19 AM, Jeff Layton <jlayton@xxxxxxxxxx> wrote:
> > > 
> > > On Sun, 2016-08-07 at 18:22 -0400, Jeff Layton wrote:
> > > > 
> > > > On Sun, 2016-08-07 at 14:53 -0400, Chuck Lever wrote:
> > > > > 
> > > > > 
> > > > > > When running LTP's nfslock01 test, the Linux client can send a LOCK
> > > > > > and a FREE_STATEID request at the same time. The LOCK uses the same
> > > > > > lockowner as the stateid sent in the FREE_STATEID request.
> > > > > 
> > > > > The outcome is:
> > > > > 
> > > > > Frame 115025 C FREE_STATEID stateid 2/A
> > > > > Frame 115026 C LOCK offset 672128 len 64
> > > > > Frame 115029 R FREE_STATEID NFS4_OK
> > > > > Frame 115030 R LOCK stateid 3/A
> > > 
> > > Oh, to be clear here -- I assume this is a lk_is_new lock (with an
> > > open stateid in it). Right?
> > 
> >         Opcode: LOCK (12)
> >             locktype: WRITEW_LT (4)
> >             reclaim?: No
> >             offset: 672000
> >             length: 64
> >             new lock owner?: Yes
> >             seqid: 0x00000000
> >             stateid
> >                 [StateID Hash: 0x6f7e]
> >                 seqid: 0x00000002
> >                 Data: a95169579501000007000000
> >             lock_seqid: 0x00000000
> >             Owner
> >                 clientid: 0xa951695795010000
> >                 Data: <DATA>
> >                     length: 20
> >                     contents: <DATA>
> > 
> > The first appearance of that stateid is in an earlier OPEN reply:
> > 
> >         Opcode: OPEN (18)
> >             Status: NFS4_OK (0)
> >             stateid
> >                 [StateID Hash: 0x6f7e]
> >                 seqid: 0x00000002
> >                 Data: a95169579501000007000000
> >             change_info
> >                 Atomic: No
> >                 changeid (before): 0
> >                 changeid (after): 0
> >             result flags: 0x00000004, locktype posix
> >                 .... .... .... .... .... .... .... ..0. = confirm: False
> >                 .... .... .... .... .... .... .... .1.. = locktype posix: True
> >                 .... .... .... .... .... .... .... 0... = preserve unlinked: False
> >                 .... .... .... .... .... .... ..0. .... = may notify lock: False
> >             Delegation Type: OPEN_DELEGATE_NONE (0)
> 
> Oh, the client behavior makes more sense, then.
> 
> Still, did we establish for certain that the client isn't required to
> serialize here?
> 
> We'd want it fixed either way, but it'd be nice to know.
> 
> --b.
> 

I don't _think_ it is, since we aren't using a LOCK stateid at this
point. There's really nothing to serialize this against, other than
pending FREE_STATEID calls. I don't think we'd want to serialize LOCK
and FREE_STATEID, though, as that would prevent the client from lazily
freeing stateids. I think fixing this on the server is probably the
better option.
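
For illustration only, here is a minimal user-space sketch of the
lock-then-recheck pattern that the follow-up patch quoted below applies
in lookup_or_create_lock_state. It is not nfsd code: demo_stid,
demo_table, and every demo_* function are hypothetical stand-ins, pthread
mutexes play the roles of cl_lock and st_mutex, and the "race" callback
is an assumption added only to force the interleaving deterministically.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical user-space stand-ins; these are not the nfsd structures. */
struct demo_stid {
	pthread_mutex_t mutex;	/* plays the role of st_mutex */
	bool hashed;		/* protected by demo_table.lock */
	atomic_int refcount;
	int id;			/* tells a recreated entry from the old one */
};

struct demo_table {
	pthread_mutex_t lock;	/* plays the role of cl_lock */
	struct demo_stid *slot;	/* one-entry "hash table" */
	int next_id;
};

/* Drop one reference; free the entry once the last one is gone. */
static void demo_put(struct demo_stid *s)
{
	int old = atomic_fetch_sub(&s->refcount, 1);

	assert(old > 0);
	if (old == 1) {
		pthread_mutex_destroy(&s->mutex);
		free(s);
	}
}

/* Return the hashed entry with an extra reference, creating it if needed. */
static struct demo_stid *demo_find_or_create(struct demo_table *t)
{
	struct demo_stid *s;

	pthread_mutex_lock(&t->lock);
	s = t->slot;
	if (!s && (s = calloc(1, sizeof(*s))) != NULL) {
		pthread_mutex_init(&s->mutex, NULL);
		s->hashed = true;
		s->refcount = 1;	/* the table's reference */
		s->id = ++t->next_id;
		t->slot = s;
	}
	if (s)
		s->refcount++;		/* the caller's reference */
	pthread_mutex_unlock(&t->lock);
	return s;
}

/* FREE_STATEID analogue: unhash the current entry while holding its mutex. */
static void demo_free_current(struct demo_table *t)
{
	struct demo_stid *s;

	pthread_mutex_lock(&t->lock);
	s = t->slot;
	if (s)
		s->refcount++;
	pthread_mutex_unlock(&t->lock);
	if (!s)
		return;
	pthread_mutex_lock(&s->mutex);
	pthread_mutex_lock(&t->lock);
	if (s->hashed) {
		s->hashed = false;
		t->slot = NULL;
		s->refcount--;	/* the table's reference; ours keeps s alive */
	}
	pthread_mutex_unlock(&t->lock);
	pthread_mutex_unlock(&s->mutex);
	demo_put(s);
}

/* LOCK analogue: return the entry locked and still hashed, else retry. */
static struct demo_stid *demo_lookup_locked(struct demo_table *t,
					    void (*race)(struct demo_table *))
{
	struct demo_stid *s;
	bool hashed;

	while ((s = demo_find_or_create(t)) != NULL) {
		if (race) {	/* simulate a free in the unlocked window */
			race(t);
			race = NULL;
		}
		pthread_mutex_lock(&s->mutex);
		pthread_mutex_lock(&t->lock);
		hashed = s->hashed;
		pthread_mutex_unlock(&t->lock);
		if (hashed)
			return s;	/* caller unlocks and puts */
		pthread_mutex_unlock(&s->mutex);
		demo_put(s);
	}
	return NULL;
}
```

Passing demo_free_current as the race callback makes the first lookup
lose the race, so demo_lookup_locked drops the stale entry and returns a
freshly created one. The caller then owns the entry's mutex and one
reference, and must release both.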

> > 
> > 
> > > 
> > > > 
> > > > > 
> > > > > Frame 115034 C WRITE stateid 0/A offset 672128 len 64
> > > > > Frame 115038 R WRITE NFS4ERR_BAD_STATEID
> > > > > 
> > > > > > In other words, the server returns stateid A in a successful LOCK
> > > > > > reply, but it has already released it. Subsequent uses of the
> > > > > > stateid fail.
> > > > > > 
> > > > > > To address this, protect the generation check in nfsd4_free_stateid
> > > > > > with the st_mutex. This should guarantee that only one of two
> > > > > > outcomes occurs: either LOCK returns a fresh valid stateid, or
> > > > > > FREE_STATEID returns NFS4ERR_LOCKS_HELD.
> > > > > 
> > > > > Reported-by: Alexey Kodanev <alexey.kodanev@xxxxxxxxxx>
> > > > > Fix-suggested-by: Jeff Layton <jlayton@xxxxxxxxxx>
> > > > > Signed-off-by: Chuck Lever <chuck.lever@xxxxxxxxxx>
> > > > > ---
> > > > >  fs/nfsd/nfs4state.c |   19 ++++++++++++-------
> > > > >  1 file changed, 12 insertions(+), 7 deletions(-)
> > > > > 
> > > > > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> > > > > index b921123..07dc1aa 100644
> > > > > --- a/fs/nfsd/nfs4state.c
> > > > > +++ b/fs/nfsd/nfs4state.c
> > > > > @@ -4911,19 +4911,20 @@ nfsd4_free_stateid(struct svc_rqst
> > > > > *rqstp,
> > > > > struct nfsd4_compound_state *cstate,
> > > > >  		ret = nfserr_locks_held;
> > > > >  		break;
> > > > >  	case NFS4_LOCK_STID:
> > > > > +		atomic_inc(&s->sc_count);
> > > > > +		spin_unlock(&cl->cl_lock);
> > > > > +		stp = openlockstateid(s);
> > > > > +		mutex_lock(&stp->st_mutex);
> > > > >  		ret = check_stateid_generation(stateid, &s-
> > > > > > 
> > > > > > 
> > > > > > sc_stateid, 1);
> > > > >  		if (ret)
> > > > > -			break;
> > > > > -		stp = openlockstateid(s);
> > > > > +			goto out_mutex_unlock;
> > > > >  		ret = nfserr_locks_held;
> > > > >  		if (check_for_locks(stp->st_stid.sc_file,
> > > > >  				    lockowner(stp-
> > > > > > 
> > > > > > st_stateowner)))
> > > > > -			break;
> > > > > -		WARN_ON(!unhash_lock_stateid(stp));
> > > > > -		spin_unlock(&cl->cl_lock);
> > > > > -		nfs4_put_stid(s);
> > > > > +			goto out_mutex_unlock;
> > > > > +		release_lock_stateid(stp);
> > > > >  		ret = nfs_ok;
> > > > > -		goto out;
> > > > > +		goto out_mutex_unlock;
> > > > >  	case NFS4_REVOKED_DELEG_STID:
> > > > >  		dp = delegstateid(s);
> > > > >  		list_del_init(&dp->dl_recall_lru);
> > > > > @@ -4937,6 +4938,10 @@ out_unlock:
> > > > >  	spin_unlock(&cl->cl_lock);
> > > > >  out:
> > > > >  	return ret;
> > > > > +out_mutex_unlock:
> > > > > +	mutex_unlock(&stp->st_mutex);
> > > > > +	nfs4_put_stid(s);
> > > > > +	goto out;
> > > > >  }
> > > > >  
> > > > >  static inline int
> > > > > 
> > > > >  
> > > > 
> > > > Looks good to me.
> > > > 
> > > > Reviewed-by: Jeff Layton <jlayton@xxxxxxxxxx>
> > > 
> > > Hmm...I think this is not a complete fix though. We also need
> > > something like this patch:
> > 
> > OK, I'll create a series and add this patch.
> > 
> > 
> > > 
> > > --------------[snip]---------------
> > > 
> > > [PATCH] nfsd: don't return an already-unhashed lock stateid after taking mutex
> > > 
> > > nfsd4_lock will take the st_mutex before working with the stateid it
> > > gets, but between the time when we drop the cl_lock and take the mutex,
> > > the stateid could become unhashed (a la FREE_STATEID). If that happens
> > > the lock stateid returned to the client will be forgotten.
> > > 
> > > Fix this by first moving the st_mutex acquisition into
> > > lookup_or_create_lock_state. Then, have it check to see if the lock
> > > stateid is still hashed after taking the mutex. If it's not, then put
> > > the stateid and try the find/create again.
> > > 
> > > Signed-off-by: Jeff Layton <jlayton@xxxxxxxxxx>
> > > ---
> > > fs/nfsd/nfs4state.c | 25 ++++++++++++++++++++-----
> > > 1 file changed, 20 insertions(+), 5 deletions(-)
> > > 
> > > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> > > index 5d6a28af0f42..1235b1661703 100644
> > > --- a/fs/nfsd/nfs4state.c
> > > +++ b/fs/nfsd/nfs4state.c
> > > @@ -5653,7 +5653,7 @@ static __be32
> > > lookup_or_create_lock_state(struct nfsd4_compound_state *cstate,
> > > 			    struct nfs4_ol_stateid *ost,
> > > 			    struct nfsd4_lock *lock,
> > > -			    struct nfs4_ol_stateid **lst, bool *new)
> > > +			    struct nfs4_ol_stateid **plst, bool *new)
> > > {
> > > 	__be32 status;
> > > 	struct nfs4_file *fi = ost->st_stid.sc_file;
> > > @@ -5661,7 +5661,9 @@ lookup_or_create_lock_state(struct nfsd4_compound_state *cstate,
> > > 	struct nfs4_client *cl = oo->oo_owner.so_client;
> > > 	struct inode *inode = d_inode(cstate->current_fh.fh_dentry);
> > > 	struct nfs4_lockowner *lo;
> > > +	struct nfs4_ol_stateid *lst;
> > > 	unsigned int strhashval;
> > > +	bool hashed;
> > > 
> > > 	lo = find_lockowner_str(cl, &lock->lk_new_owner);
> > > 	if (!lo) {
> > > @@ -5677,12 +5679,27 @@ lookup_or_create_lock_state(struct nfsd4_compound_state *cstate,
> > > 			goto out;
> > > 	}
> > > 
> > > -	*lst = find_or_create_lock_stateid(lo, fi, inode, ost, new);
> > > -	if (*lst == NULL) {
> > > +retry:
> > > +	lst = find_or_create_lock_stateid(lo, fi, inode, ost, new);
> > > +	if (lst == NULL) {
> > > 		status = nfserr_jukebox;
> > > 		goto out;
> > > 	}
> > > +
> > > +	mutex_lock(&lst->st_mutex);
> > > +
> > > +	/* See if it's still hashed to avoid race with FREE_STATEID */
> > > +	spin_lock(&cl->cl_lock);
> > > +	hashed = !list_empty(&lst->st_perfile);
> > > +	spin_unlock(&cl->cl_lock);
> > > +
> > > +	if (!hashed) {
> > > +		mutex_unlock(&lst->st_mutex);
> > > +		nfs4_put_stid(&lst->st_stid);
> > > +		goto retry;
> > > +	}
> > > 	status = nfs_ok;
> > > +	*plst = lst;
> > > out:
> > > 	nfs4_put_stateowner(&lo->lo_owner);
> > > 	return status;
> > > @@ -5752,8 +5769,6 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
> > > 			goto out;
> > > 		status = lookup_or_create_lock_state(cstate, open_stp, lock,
> > > 							&lock_stp, &new);
> > > -		if (status == nfs_ok)
> > > -			mutex_lock(&lock_stp->st_mutex);
> > > 	} else {
> > > 		status = nfs4_preprocess_seqid_op(cstate,
> > > 				       lock->lk_old_lock_seqid,
> > > -- 
> > > 2.7.4
> > 
> > --
> > Chuck Lever
> > 
> > 
> > 
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> > the body of a message to majordomo@xxxxxxxxxxxxxxx
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html

-- 
Jeff Layton <jlayton@xxxxxxxxxx>