On May 25, 2012, at 9:46 AM, J. Bruce Fields wrote: > On Thu, May 24, 2012 at 03:29:51PM -0400, Chuck Lever wrote: >> The error values that TEST_STATEID is allowed to return does not >> include NFS4ERR_STALE_STATEID. In addition, RFC 5661 says: >> >> 15.1.16.5. NFS4ERR_STALE_STATEID (Error Code 10023) >> >> A stateid generated by an earlier server instance was used. This >> error is moot in NFSv4.1 because all operations that take a stateid >> MUST be preceded by the SEQUENCE operation, and the earlier server >> instance is detected by the session infrastructure that supports >> SEQUENCE. >> >> I triggered the NFS4ERR_STALE_STATEID during nograce recovery testing. >> My client had updated its boot verifier, so the server instance hadn't >> changed, but the client instance had. Thus the server allowed the >> SEQUENCE operation, but returned NFS4ERR_STALE_STATEID on the >> TEST_STATEID operation. >> >> After a client's lease expires, TEST_STATEID should report >> NFS4ERR_EXPIRED for state IDs that the client tries to recover. I >> don't see a way to make that happen, though. > > After the client's lease expires, the SEQUENCE operation will fail. > > (Which I believe to be a valid, if unforgiving, server implementation. > If we were to implement "courtesy locks" in this case, I believe we'd > remember the clientid for longer, permit the SEQUENCE, and fail > individual stateid's with EXPIRED as appropriate?) > >> Finally, RFC 5661, section 18.48.3 has this: >> >> o Special stateids are always considered invalid (they result in the >> error code NFS4ERR_BAD_STATEID). > > Thanks for the explanation! > >> >> Signed-off-by: Chuck Lever <chuck.lever@xxxxxxxxxx> >> --- >> >> Bruce, would you consider taking something like this? > > Sure; nits: > >> fs/nfsd/nfs4state.c | 7 ++++--- >> 1 files changed, 4 insertions(+), 3 deletions(-) >> >> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c >> index 9235cfa..ae1fab3 100644 >> --- a/fs/nfsd/nfs4state.c >> +++ b/fs/nfsd/nfs4state.c >> @@ -3365,12 +3365,13 @@ __be32 nfs4_validate_stateid(struct nfs4_client *cl, stateid_t *stateid) >> struct nfs4_ol_stateid *ols; >> __be32 status; >> >> + if (ZERO_STATEID(stateid) || ONE_STATEID(stateid)) >> + return nfserr_bad_stateid; > > Or inval? This is just a buggy client. RFC 5661 says specifically that TEST_STATEID should return NFS4ERR_BAD_STATEID in this case. > >> if (STALE_STATEID(stateid)) >> - return nfserr_stale_stateid; >> - >> + return nfserr_bad_stateid; > > Again, this is just a buggy client, since we shouldn't have gotten past > the SEQUENCE in this case unless the client's sending a stateid that's > actually someone else's. In my nograce test, the server does get past the SEQUENCE. So maybe there's a bug there too? > If you think it's worth checking for those buggy client cases, we could > instaed check that stateid->si_opaque.so_clid and cl->clientid agree. > That'd cover the special-stateid checks too. That's interesting. I can redrive this patch with this change. Yes, I think checking for buggy clients is a good idea. > >> s = find_stateid(cl, stateid); >> if (!s) >> - return nfserr_stale_stateid; >> + return nfserr_bad_stateid; > > So this must be the case you actually hit. Agreed with this change. > > > --b. > >> status = check_stateid_generation(stateid, &s->sc_stateid, 1); >> if (status) >> return status; >> > -- > To unsubscribe from this list: send the line "unsubscribe linux-nfs" in > the body of a message to majordomo@xxxxxxxxxxxxxxx > More majordomo info at http://vger.kernel.org/majordomo-info.html -- Chuck Lever chuck[dot]lever[at]oracle[dot]com -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html