On Sun, Apr 15, 2012 at 09:03:23PM +0200, Bernd Schubert wrote:
> On 04/13/2012 05:42 PM, Jeff Layton wrote:
> > (note: please don't trim the CC list!)
> >
> > Indefinitely does make some sense (as Peter articulated in his original
> > set). It's possible you could race several times in a row, or a server
> > misconfiguration or something has happened and you have a transient
> > error that will eventually recover. His assertion was that any limit on
> > the number of retries is by definition wrong. For NFS, a fatal signal
> > ought to interrupt things as well, so retrying indefinitely has some
> > appeal there.
> >
> > OTOH, we do have to contend with filesystems that might return ESTALE
> > persistently for other reasons and that might not respond to signals.
> > Miklos pointed out that some FUSE fs' do this in his review of Peter's
> > set.
> >
> > As a purely defensive coding measure, limiting the number of retries to
> > something finite makes sense. If we're going to do that though, I'd
> > probably recommend that we set the number of retries be something
> > higher just so that this is more resilient in the face of multiple
> > races. Those other fs' might "spin" a bit in that case but it is an
> > error condition and IMO resiliency trumps performance -- at least in
> > this case.
>
> I am definitely voting against an infinite number of retries. I'm
> working on FhGFS, which supports distributed meta data servers. So when
> a file is moved around between directories, its file handle, which
> contains the meta-data target id might become invalid. As NFSv3 is
> stateless we cannot inform the client about that and must return ESTALE
> then.

Note we're not talking about retrying the operation that returned ESTALE
with the same filehandle--probably any server would return ESTALE again
in that case.
We're talking about re-looking up the path (in the case where we're
implementing a system call that takes a path as an argument), and then
retrying the operation with the newly looked-up filehandle.

--b.

> NFSv4 is better, but I'm not sure how well invalidating a file
> handle works. So retrying once on ESTALE might be a good idea, but
> retrying forever is not.
> Also, what about asymmetric HA servers? I believe to remember that also
> resulted in ESTALE. So for example server1 exports /home and /scratch,
> but on failure server2 can only take over /home and denies access to
> /scratch.
>
>
> Thanks,
> Bernd
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
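
To make the distinction above concrete, here is a rough user-space sketch
of "re-look up the path, then retry with the new filehandle", using a
finite retry limit as discussed earlier in the thread. It is not kernel
code and not taken from any posted patch: do_path_op(), lookup_fn, op_fn,
the toy_* helpers and the MAX_ESTALE_RETRIES value are all made up for
illustration.

```c
#include <errno.h>

/* Assumed bound for illustration only; the thread does not settle on a value. */
#define MAX_ESTALE_RETRIES 3

/* Hypothetical callbacks: resolve a path to a handle, then operate on it. */
typedef int (*lookup_fn)(const char *path, unsigned long *handle);
typedef int (*op_fn)(unsigned long handle);

/*
 * Run a path-based operation. On -ESTALE, redo the lookup to obtain a fresh
 * handle and try again, giving up after MAX_ESTALE_RETRIES retries. Any
 * other result (success or a different error) is returned immediately.
 */
static int do_path_op(const char *path, lookup_fn lookup, op_fn op)
{
    int err = -ESTALE;

    for (int attempt = 0; attempt <= MAX_ESTALE_RETRIES; attempt++) {
        unsigned long handle;

        err = lookup(path, &handle);    /* fresh lookup on every attempt */
        if (err == 0)
            err = op(handle);           /* never reuse the stale handle */
        if (err != -ESTALE)
            break;
    }
    return err;
}

/* Toy "server": accepts exactly one handle value at a time. */
static unsigned long server_handle = 1;
static int pending_races;   /* moves that will land between lookup and op */

static int toy_lookup(const char *path, unsigned long *handle)
{
    (void)path;
    *handle = server_handle;
    return 0;
}

static int toy_op(unsigned long handle)
{
    if (pending_races > 0) {    /* a concurrent move invalidates the handle */
        pending_races--;
        server_handle++;
    }
    return handle == server_handle ? 0 : -ESTALE;
}
```

With pending_races set to 2, do_path_op() loses the race twice and succeeds
on the third attempt. With more pending races than the limit allows, it
returns -ESTALE instead of spinning, which is the behaviour a filesystem
that returns ESTALE persistently would get.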