On Mon, Feb 26, 2018 at 11:20:49AM +0530, Raghavendra G wrote:
> On Fri, Feb 23, 2018 at 6:33 AM, J. Bruce Fields <bfields@xxxxxxxxxxxx>
> wrote:
>
> > On Thu, Feb 22, 2018 at 01:17:58PM +0530, Raghavendra G wrote:
> > > For a local filesystem, we may not end up in ESTALE errors. But, when
> > > rmdir is executed from multiple clients of a network fs (like NFS,
> > > Glusterfs), unlink or rmdir can easily fail with ESTALE as the other
> > > rm invocation could've deleted it. I think this is what has happened
> > > in bugs like:
> > > https://bugzilla.redhat.com/show_bug.cgi?id=1546717
> > > https://bugzilla.redhat.com/show_bug.cgi?id=1245065
> > >
> > > This in fact was the earlier motivation to convert ESTALE into ENOENT,
> > > so that rm would ignore it. Now that I reverted the fix, looks like
> > > the bug has promptly resurfaced :)
> > >
> > > There is one glitch though. Bug 1245065 mentions that some parts of
> > > the directory structure remain undeleted. From my understanding, at
> > > least one instance of rm (the one racing ahead of all the others and
> > > causing them to fail) should've deleted the directory structure
> > > completely. Though, I need to understand the directory traversal done
> > > by rm to find whether there is a cyclic dependency between two rms
> > > causing both of them to fail.
> >
> > I don't see how you could avoid that. The clients are each caching
> > multiple subdirectories of the tree, and there's no guarantee that 1
> > client has fresher caches of every subdirectory. There's also no
> > guarantee that the client that's ahead stays ahead--another client that
> > sees which objects the first client has already deleted can leapfrog
> > ahead.
>
> What are the drawbacks of applications (like rm) treating ESTALE as
> equivalent to ENOENT? It seems to me, from the application perspective,
> they both convey similar information. If rm could ignore ESTALE just
> like it does for ENOENT, probably we don't run into this issue.

That might work.
Or, maybe better, take "ESTALE" as a sign that the parent directory is
gone and give up on trying to remove further entries from it.

Could you remind me why this is a priority, anyway? A quick look at the
bz's suggests they're both artificial tests. Were they motivated by a
customer problem originally? Apologies if we've already been over
this....

--b.