On Tue, 24 Apr 2012 17:54:56 +0200
Miklos Szeredi <miklos@xxxxxxxxxx> wrote:

> Jeff Layton <jlayton@xxxxxxxxxx> writes:
>
> > On Mon, 23 Apr 2012 16:38:00 -0400
> > Peter Staubach <pstaubach@xxxxxxxxxxx> wrote:
> >
> >> I don't really like the idea of introducing another errno as well. It seems like too much complexity and represents complexity that no one has really justified needing.
> >>
> >
> > I tend to agree here. Miklos, can you elaborate a bit on what fuse
> > filesystems you're particularly concerned about here? Which ones return
> > ESTALE and under what conditions. Maybe we can try to tailor this
> > solution to avoid the complexity without impacting them.
>
> It is not just fuse I'm concerned about. Grep for -ESTALE, there are
> about 120 hits about 20 of which come from NFS. There's no guarantee
> that any of those ESTALE errors will go away on retry, which for an
> unlimited retry means a hung OS. If you limit the number of retries
> then in the best case it's just lots of wasted CPU cycles.
>

Yes, that's one of the first things I did when I went to look at this
problem. Almost all of those are in export_operations related code and
would never be returned on a path based syscall. Others are not in
fs-related code at all (another subsystem has repurposed the error
code), or are in fs-related code but are using it internally for other
purposes (JFS seems to have some of this).

While I don't like to waste CPU cycles, this is an error condition and
I don't think we're well served in optimizing for it.

> And an audit would still not ensure safety against future additions of
> ESTALE.
>

Well, nothing is safe from the future. It's incumbent upon us to review
patches and such before such breakage goes in.

> And a simple audit won't find things like fuse, where the error comes
> from outside the kernel. Fixing that is not trivial either. Turning
> ESTALE into some other error prevents looping but breaks the return
> value.
>

Then that fs is just plain returning the wrong error, IMO. We're not
breaking any kABI guarantees with this -- they're still able to return
ESTALE, it's just that the behavior on such a return is more resilient
if we reattempt.

But, let's say for the purposes of argument that we do have a fs (FUSE
or otherwise) that is persistently returning ESTALE on a lookup. Why
was Peter's check that we were making forward progress not enough to
guard against this problem? In particular, I'm talking about the code
he added to link_path_walk in this patch to check that the value of
nd->path.dentry was changing:

    https://lkml.org/lkml/2008/3/10/266

It seems like that ought to be enough to alleviate your fears on this.
We could also check for fatal signals on each pass and that would allow
users to break out of the loop even when the underlying fs doesn't
handle signals properly.

-- 
Jeff Layton <jlayton@xxxxxxxxxx>
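
For illustration, the two guards discussed above (give up when a retry
fails at the same dentry as the previous attempt, and bail out when a
fatal signal is pending) can be modelled in plain userspace C. This is
only a rough sketch under assumed names: struct walk_state,
lookup_retry_estale(), the callback types, the demo callbacks, and the
choice of -EINTR are all hypothetical and are not taken from Peter's
patch or from the kernel.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the walk state; 'dentry' plays the role of
 * nd->path.dentry, i.e. the point the walk had reached when it failed. */
struct walk_state {
    const void *dentry;
};

/* Hypothetical callback types: one lookup attempt, and a fatal-signal probe. */
typedef int (*lookup_fn)(struct walk_state *ws, void *ctx);
typedef bool (*signal_fn)(void *ctx);

/*
 * Retry 'lookup' for as long as it returns -ESTALE, but stop when
 *  - a fatal signal is pending (assumed here to map to -EINTR), or
 *  - the retry failed at the same dentry as the previous attempt,
 *    which is taken to mean that no forward progress was made.
 */
int lookup_retry_estale(struct walk_state *ws, lookup_fn lookup,
                        signal_fn fatal_pending, void *ctx)
{
    const void *prev = NULL;
    bool first = true;

    for (;;) {
        int err = lookup(ws, ctx);

        if (err != -ESTALE)
            return err;              /* success, or an unrelated error */
        if (fatal_pending && fatal_pending(ctx))
            return -EINTR;           /* let the user break out of the loop */
        if (!first && ws->dentry == prev)
            return -ESTALE;          /* stuck at the same place: give up */

        prev = ws->dentry;           /* remember where this attempt failed */
        first = false;
    }
}

/* Demo callbacks; 'ctx' points to an int that counts lookup attempts. */
static const char dummy_dentry;

/* A filesystem that persistently fails at the same dentry. */
int always_stale(struct walk_state *ws, void *ctx)
{
    ++*(int *)ctx;
    ws->dentry = &dummy_dentry;
    return -ESTALE;
}

/* A filesystem whose stale handle is resolved by a single retry. */
int stale_once(struct walk_state *ws, void *ctx)
{
    int *calls = ctx;

    ws->dentry = &dummy_dentry;
    return (++*calls == 1) ? -ESTALE : 0;
}

/* A signal probe that always reports a pending fatal signal. */
bool always_fatal(void *ctx)
{
    (void)ctx;
    return true;
}
```

With these callbacks, a lookup that keeps failing at the same dentry is
abandoned after the second attempt instead of looping forever, while a
lookup that succeeds on the retry returns normally. The signal probe is
checked before the progress check, so a user can interrupt the loop even
if the walk does appear to be moving.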