Re: CLOSE/OPEN race

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat, 2016-11-12 at 13:03 -0500, Benjamin Coddington wrote:
> 
> On 12 Nov 2016, at 11:52, Jeff Layton wrote:
> 
> > 
> > On Sat, 2016-11-12 at 10:31 -0500, Benjamin Coddington wrote:
> > > 
> > > On 12 Nov 2016, at 7:54, Jeff Layton wrote:
> > > 
> > > > 
> > > > 
> > > > On Sat, 2016-11-12 at 06:08 -0500, Benjamin Coddington wrote:
> > > > > 
> > > > > 
> > > > > I've been seeing the following on a modified version of generic/089
> > > > > that gets the client stuck sending LOCK with NFS4ERR_OLD_STATEID.
> > > > > 
> > > > > 1. Client has open stateid A, sends a CLOSE
> > > > > 2. Client sends OPEN with same owner
> > > > > 3. Client sends another OPEN with same owner
> > > > > 4. Client gets a reply to OPEN in 3, stateid is B.2 (stateid B
> > > > > sequence 2)
> > > > > 5. Client does LOCK,LOCKU,FREE_STATEID from B.2
> > > > > 6. Client gets a reply to CLOSE in 1
> > > > > 7. Client gets reply to OPEN in 2, stateid is B.1
> > > > > 8. Client sends LOCK with B.1 - OLD_STATEID, now stuck in a loop
> > > > > 
> > > > > The CLOSE response in 6 causes us to clear NFS_OPEN_STATE, so that
> > > > > the OPEN
> > > > > response in 7 is able to update the open_stateid even though it has a
> > > > > lower
> > > > > sequence number.
> > > > > 
> > > > > I think this case could be handled by never updating the open_stateid
> > > > > if the
> > > > > stateids match but the sequence number of the new state is less than
> > > > > the
> > > > > current open_state.
> > > > > 
> > > > 
> > > > What kernel is this on?
> > > 
> > > On v4.9-rc2 with a couple fixups.  Without them, I can't test long
> > > enough to
> > > reproduce this race.  I don't think any of those are involved in this
> > > problem, though.
> > > 
> > > > 
> > > > 
> > > > Yes, that seems wrong. The client should be picking B.2 for the open
> > > > stateid to use. I think that decision of whether to take a seqid is
> > > > made
> > > > in nfs_need_update_open_stateid. The logic in there looks correct to
> > > > me
> > > > at first glance though.
> > > 
> > > nfs_need_update_open_stateid() will return true if NFS_OPEN_STATE is
> > > unset.
> > > That's the precondition set up by steps 1-6.  Perhaps it should not
> > > update
> > > the stateid if they match but the sequence number is less, and still set
> > > NFS_OPEN_STATE once more.  That will fix _this_ case.  Are there other
> > > cases
> > > where that would be a problem?
> > > 
> > > Ben
> > 
> > That seems wrong.
> 
> I'm not sure what you mean: what seems wrong?
> 

Sorry, it seems wrong that the client would issue the LOCK with B.1
there.

> > 
> > The only close was sent in step 1, and that was for a
> > completely different stateid (A rather than B). It seems likely that
> > that is where the bug is.
> 
> I'm still not sure what point you're trying to make..
> 
> Even though the close was sent in step 1, the response wasn't processed
> until step 6..

Not really a point per-se, I was just saying where I think the bug might
be...

When you issue a CLOSE, you issue it vs. a particular stateid (stateid
"A" in this case). Once the open stateid has been superseded by "B", the
closing of "A" should have no effect.

Perhaps nfs_clear_open_stateid needs to check and see whether the open
stateid has been superseded before doing its thing?

-- 
Jeff Layton <jlayton@xxxxxxxxxx>
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux