RE: NFS4ERR_STALE_CLIENTID loop

"Myklebust, Trond" <Trond.Myklebust@xxxxxxxxxx> · Sat, 29 Oct 2011 14:12:03 -0700

> -----Original Message-----
> From: David Flynn [mailto:davidf@xxxxxxxxxxxx]
> Sent: Saturday, October 29, 2011 11:08 PM
> To: Myklebust, Trond
> Cc: David Flynn; Chuck Lever; J. Bruce Fields; linux-nfs@xxxxxxxxxxxxxxx
> Subject: Re: NFS4ERR_STALE_CLIENTID loop
> 
> * Myklebust, Trond (Trond.Myklebust@xxxxxxxxxx) wrote:
> > BAD_STATEID is a different matter, and is one that we should have
> > resolved in the NFS client in the upstream kernel. At least on newer
> > clients, we should be trying to reopen the file and re-establish all
> > locks when we get a BAD_STATEID. Can you please remind us which
> > kernel you are using?
> 
> Ah, I see.  This was all on 3.0.0 and 3.0.4 (a quick check didn't
> reveal any relevant changes between the two).
> 
> Are there any stable patches that can be applied to 3.0.y?
> 
> > That said... Even on new clients, the recovery attempt may fail due
> > to the STALE_CLIENTID bug. That will still hit us when we call OPEN
> > in order to get a new stateid.
> 
> The interval between retries on that was ~1-1.5ms, could this be made
> slower? -- same questions as before really.

Given that things such as reboot recovery are subject to a recovery
window (the grace period), I am, again, very reluctant to add an
artificial backoff, as that may have consequences for behaviour in the
bug-free case.

If the server wants us to back off, it can say so itself by using the
NFS4ERR_DELAY error mechanism.
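
To illustrate that division of labour, here is a rough user-space
sketch. It is not the kernel client code. The fake_server type, the
open_with_retry() helper and the delay bounds are all hypothetical
names and values made up for the example; only the status code numbers
come from the NFSv4 protocol. The client waits only when the server
answers NFS4ERR_DELAY, and retries straight away after
NFS4ERR_STALE_CLIENTID (client ID recovery itself is not modelled).

```c
#include <assert.h>
#include <stddef.h>

/* Status codes as numbered in the NFSv4 protocol (RFC 3530). */
enum {
    NFS4_OK                = 0,
    NFS4ERR_DELAY          = 10008,
    NFS4ERR_STALE_CLIENTID = 10022,
};

/* Hypothetical stand-in for a server: replies from a scripted list. */
struct fake_server {
    const int *replies;
    size_t     nreplies;
    size_t     next;
};

/* Return the next scripted reply; repeat the last one when exhausted. */
static int fake_open(struct fake_server *srv)
{
    assert(srv != NULL && srv->nreplies > 0);
    size_t i = srv->next < srv->nreplies ? srv->next : srv->nreplies - 1;
    srv->next++;
    return srv->replies[i];
}

struct retry_stats {
    int           attempts;   /* number of OPEN calls sent */
    unsigned long waited_ms;  /* time the client would have slept */
};

/* Illustrative bounds only; assumptions, not taken from any client. */
#define DELAY_MIN_MS 100UL
#define DELAY_MAX_MS 15000UL

static int open_with_retry(struct fake_server *srv, int max_attempts,
                           struct retry_stats *st)
{
    unsigned long delay_ms = DELAY_MIN_MS;
    int status = NFS4_OK;

    assert(st != NULL && max_attempts > 0);
    st->attempts = 0;
    st->waited_ms = 0;

    while (st->attempts < max_attempts) {
        status = fake_open(srv);
        st->attempts++;

        switch (status) {
        case NFS4ERR_DELAY:
            /* The server asked us to back off: account for the wait
             * (a real client would sleep here) and double the delay. */
            st->waited_ms += delay_ms;
            delay_ms *= 2;
            if (delay_ms > DELAY_MAX_MS)
                delay_ms = DELAY_MAX_MS;
            break;
        case NFS4ERR_STALE_CLIENTID:
            /* Client ID recovery would run here (not modelled).
             * No artificial backoff: retry immediately. */
            delay_ms = DELAY_MIN_MS;
            break;
        default:
            return status;  /* success or an error we do not retry */
        }
    }
    return status;  /* gave up after max_attempts */
}
```

With a server that only ever answers NFS4ERR_STALE_CLIENTID, this loop
spins without waiting until it runs out of attempts, which is the
behaviour seen in this thread. A server that wants the client to slow
down can get that by answering NFS4ERR_DELAY instead.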

Cheers
  Trond
