On Thu, Sep 20, 2012 at 01:53:44PM -0400, Andy Adamson wrote:
> On Thu, Sep 20, 2012 at 1:47 PM, Andy Adamson <androsadamson@xxxxxxxxx> wrote:
> > On Thu, Sep 20, 2012 at 12:17 PM, J. Bruce Fields <bfields@xxxxxxxxxxxx> wrote:
> >> On Thu, Sep 20, 2012 at 12:06:48PM -0400, Andy Adamson wrote:
> >>> On Thu, Sep 20, 2012 at 10:34 AM, William Dauchy <wdauchy@xxxxxxxxx> wrote:
> >>> > On Tue, Sep 18, 2012 at 11:49 AM, William Dauchy <wdauchy@xxxxxxxxx> wrote:
> >>> >> I'm getting a trace following an unhandled error on a linux nfs client
> >>> >> 3.4.7 x86_64.
> >>> >> NFS: nfs4_reclaim_open_state: unhandled error -10026. Zeroing state
> >>> >
> >>> > For the moment I don't know if the error is coming from a bad server
> >>> > implementation or if it's on client side. Should I assume that this an
> >>> > error that should never hit the client?
> >>>
> >>> Yes.
> >>>
> >>> The client only sends OPEN reclaims after noting the server has
> >>> rebooted due to previously receiving an NFS4ERR_STALE_CLIENTID or
> >>> NFS4ERR_STALE_STATEID error from a state-full operation (RENEW, OPEN,
> >>> OPEN_DOWNGRADE, OPEN_CONFIRM, CLOSE, LOCK, LOCKU) which triggers the
> >>> client to establish a new clientid via
> >>> SETCLIENTID/SETCLIENTID_CONFIRM.
> >>>
> >>> Upon server reboot, all state that the previous server instance had is
> >>> invalid - including OPEN seqid's. So, the server returning
> >>> NFS4ERR_BAD_SEQID (10026) on an OPEN reclaim is illegal.
> >>
> >> Wait, but couldn't there be multiple reclaims using the same open owner,
> >> in which case later reclaims could in theory hit BAD_SEQID?
> >
> > Nope.
> >
> > 3530 section 9.1.6. Sequencing of Lock Requests
> >
> > Note that for requests that contain a sequence number, for each
> > state-owner, there should be no more than one outstanding request.
>
> Well - I sent this too soon :) . Yes, a buggy client could send
> (serialized) reclaims with a bad seqid, and get NFS4ERR_BAD_SEQ.
> Tough to do with the above constraint, but possible.

William, is this easy to reproduce? Would it be possible to get a
network trace covering the problem?

(Run tcpdump -s0 -wtmp.pcap, then send us tmp.pcap. Also feel free to
take a look at tmp.pcap with wireshark yourself; you may be able to find
the call that's returning BAD_SEQID. What we'll be curious about is the
sequence id sent on that call, and the sequence id on any preceding
operations using the same open owner.)

--b.
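
For anyone reading the trace, the seqid check being discussed can be
modeled with a small sketch. It assumes the usual NFSv4.0 owner-seqid
rule: a request repeating the last seqid is treated as a retransmission,
last+1 is the next request, and anything else gets NFS4ERR_BAD_SEQID.
The class SeqidTracker and its method names are hypothetical, made up
for illustration; they are not taken from the Linux client or from any
server. Accepting whatever seqid an unknown owner sends first, and the
32-bit wraparound, are simplifying assumptions.

```python
NFS4ERR_BAD_SEQID = 10026  # the error number seen in the client log


class SeqidTracker:
    """Toy model of per-open-owner seqid checking (hypothetical names)."""

    def __init__(self):
        # open owner -> last seqid accepted from that owner
        self.last = {}

    def reboot(self):
        # A server reboot discards all state, including seqids.
        self.last.clear()

    def check(self, owner, seqid):
        prev = self.last.get(owner)
        if prev is None:
            # Assumption: no state for this owner (e.g. first reclaim
            # after a reboot), so there is nothing to compare against.
            self.last[owner] = seqid
            return "new"
        if seqid == prev:
            # Same seqid as the last request: treat as a retransmission.
            return "replay"
        if seqid == (prev + 1) & 0xFFFFFFFF:
            self.last[owner] = seqid
            return "next"
        # Neither a replay nor the next in sequence.
        return "bad_seqid"


server = SeqidTracker()
server.check("owner-A", 41)      # state established before the reboot
server.reboot()                  # all previous seqids are forgotten
first = server.check("owner-A", 7)    # "new": first reclaim for the owner
second = server.check("owner-A", 8)   # "next": serialized second reclaim
buggy = server.check("owner-A", 10)   # "bad_seqid": client skipped a seqid
```

In this model the first reclaim after a reboot cannot fail on seqid,
which matches Andy's point; only a later reclaim on the same open owner
can, and only if the client sends something other than last+1. That is
why the seqids on the failing call and on the preceding calls for the
same open owner are the interesting part of the trace.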