> On Aug 24, 2016, at 14:10, Chuck Lever <chuck.lever@xxxxxxxxxx> wrote:
>
> Hi-
>
> I have a wire capture that shows this race while a simple I/O workload is
> running:
>
> 0. The client reconnects after a network partition
> 1. The client sends a couple of READ requests
> 2. The client independently discovers its lease has expired
> 3. The client establishes a fresh lease
> 4. The client destroys open, lock, and delegation stateids for the file
>    that was open under the previous lease
> 5. The client issues a new OPEN to recover state for that file
> 6. The server replies to the READs in step 1 with NFS4ERR_EXPIRED
> 7. The client turns the READs around immediately using the current open
>    stateid for that file, which is the zero stateid
> 8. The server replies NFS4_OK to the OPEN from step 5
>
> If I understand the code correctly, if the server happened to send those
> READ replies after its OPEN reply (rather than before), the client would
> have used the recovered open stateid instead of the zero stateid when
> resending the READ requests.
>
> Would it be better if the client recognized there is state recovery in
> progress, and then waited for recovery to complete, before retrying the
> READs?

Why isn’t the session draining taking care of ensuring the READs don’t
happen until after recovery is done?
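
For illustration only, here is a rough userspace sketch of the check being
suggested in the quoted mail: defer the READ retry while recovery is still
running, or while the file's current open stateid is the zero stateid, and
only resend once a recovered stateid has been installed. All struct and
function names below are invented for the sketch; they are not the Linux
client's actual data structures or APIs.

/*
 * Illustrative sketch only -- these structs and helpers are invented for
 * this example and are not the Linux NFS client's data structures.
 */
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define NFS4_STATEID_SIZE 16

struct stateid {
	unsigned char data[NFS4_STATEID_SIZE];
};

struct open_state {
	struct stateid open_stateid;      /* zero while OPEN recovery is pending */
	bool           recovery_running;  /* lease/state recovery in progress   */
};

static bool stateid_is_zero(const struct stateid *sid)
{
	static const struct stateid zero;

	return memcmp(sid, &zero, sizeof(zero)) == 0;
}

/*
 * Decide whether a READ that came back NFS4ERR_EXPIRED may be resent
 * right away.  Returns true and fills in *retry_sid if a valid open
 * stateid is available; returns false if the caller should wait for
 * recovery to finish (steps 4-8 of the trace) before retrying.
 */
static bool read_retry_ready(const struct open_state *os,
			     struct stateid *retry_sid)
{
	if (os->recovery_running || stateid_is_zero(&os->open_stateid))
		return false;

	*retry_sid = os->open_stateid;
	return true;
}

int main(void)
{
	struct open_state os = { .recovery_running = true };
	struct stateid sid;

	/* Mid-recovery: the retry must not go out with the zero stateid. */
	printf("retry now? %s\n", read_retry_ready(&os, &sid) ? "yes" : "no");

	/* OPEN recovery completed: a fresh open stateid is installed. */
	os.recovery_running = false;
	memset(os.open_stateid.data, 0xab, sizeof(os.open_stateid.data));
	printf("retry now? %s\n", read_retry_ready(&os, &sid) ? "yes" : "no");

	return 0;
}

In a real client the "wait" would of course be a sleep on the state
manager's completion rather than a simple boolean check, but the decision
point is the same: don't put a stateid on the wire for the retry until
recovery has installed one.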