> On Mar 11, 2016, at 1:21 PM, Chuck Lever <chuck.lever@xxxxxxxxxx> wrote:
>
> Hi-
>
> We observed some behavior at Connectathon between a
> v4.5-rc6 Linux client and a prototype Solaris 12 server
> when using NFSv4.1 on TCP. Test is xfstests generic/089.
>
> At an earlier point, the client has been granted a
> write delegation, denoted below by "state=w", and has
> closed the file.
>
> The sequence then observed on the wire is:
>
> C OPEN fh=A claim=DELEG_CUR_FH
> C LOCK fh=A zero stateid
> R OPEN NFS4_OK state=a
> R LOCK NFS4ERR_BAD_STATEID
> C TEST_STATEID state=a
> R TEST_STATEID NFS4_OK
>
> client reports "Lock reclaim failed!"
>
> C LOCK fh=A state=a
> R LOCK NFS4_OK
> C DELEGRETURN state=w
> R DELEGRETURN NFS4_OK
> C RENAME -> .nfsXXXXXXXXXX
> R RENAME NFS4_OK
>
> I've reproduced the problem here at home. Sometimes
> the LOCK operation is emitted just _before_ the
> OPEN.
>
> There are two processes involved. One is attempting
> to lock the file. The other is attempting to unlink
> the same file while it is still open.
>
> I'm not sure why the client is emitting the LOCK
> operation at all, since it still holds a write
> delegation. The "Lock reclaim failed!" message
> seems to reflect this confusion: It expects to find
> and recover lock state, but there hasn't been a
> successful LOCK on that file yet.
>
> After browsing the code, I don't see any serialization
> between taking a lock and returning a delegation on
> the same file, but my understanding in this area comes
> up short.
>
> Is there a preferred way to serialize these two
> activities (like, a particular mutex that should be
> held) ?
>
> Thanks for any guidance!

Following up.

Commit 24311f884 ('NFSv4: Recovery of recalled read delegations
is broken') introduced a

  clear_bit(NFS_DELEGATED_STATE, &state->flags);

in nfs_open_delegation_recall().
That flag is cleared before the OPEN is emitted, leaving a
window where NFS_DELEGATED_STATE is clear, but there is no
valid open stateid with which to perform a LOCK.

Moving the clear_bit() into the NFS4_OK case in
nfs4_handle_delegation_recall_error() seems to eliminate
the race.

--
Chuck Lever
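
The window described above can be illustrated with a small user-space
model. This is a hypothetical sketch, not the kernel code: struct
model_state, pick_lock_stateid() and the two recall_*() functions are
invented for illustration, and the "concurrent locker" is simulated by
sampling the state at the point where the second process could run. It
assumes that a locker seeing the delegated flag does not need to send a
LOCK with an open stateid.

```c
#include <stdbool.h>

/* Hypothetical model; none of these names exist in the kernel. */
enum lock_stateid { USE_DELEGATION, USE_OPEN, USE_ZERO };

struct model_state {
    bool delegated;        /* stands in for NFS_DELEGATED_STATE */
    bool open_stateid_ok;  /* set once the reclaim OPEN returned NFS4_OK */
};

/* Which stateid would a concurrent locker find in this state? */
static enum lock_stateid pick_lock_stateid(const struct model_state *s)
{
    if (s->delegated)
        return USE_DELEGATION;  /* assumption: covered by the delegation */
    if (s->open_stateid_ok)
        return USE_OPEN;        /* LOCK can use the new open stateid */
    return USE_ZERO;            /* LOCK goes out with a zero stateid */
}

/* Ordering reported as racy: clear the flag, then emit the OPEN. */
static enum lock_stateid recall_clear_before_open(struct model_state *s)
{
    enum lock_stateid seen;

    s->delegated = false;
    seen = pick_lock_stateid(s);  /* the other process runs in this window */
    s->open_stateid_ok = true;    /* OPEN reply arrives: NFS4_OK */
    return seen;
}

/* Proposed ordering: clear the flag only in the NFS4_OK case. */
static enum lock_stateid recall_clear_after_open(struct model_state *s)
{
    enum lock_stateid seen;

    seen = pick_lock_stateid(s);  /* the other process runs in this window */
    s->open_stateid_ok = true;    /* OPEN reply arrives: NFS4_OK */
    s->delegated = false;         /* flag cleared after success */
    return seen;
}
```

Starting from a state with delegated set and no open stateid, the first
ordering lets the simulated locker observe USE_ZERO, which corresponds
to the LOCK with a zero stateid and the NFS4ERR_BAD_STATEID reply in
the trace. The second ordering never exposes a state where both are
unavailable.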