I keep looking back at this problem trying to think of a better idea. Most recently I remember Jeff's work to retry ESTALE's and wondered if we could take advantage of that. It's still kind of a bandaid, but it's the only thing I've thought of that at least helps a little and isn't a huge pain. Any other ideas? --b. On Wed, Dec 17, 2014 at 02:59:11PM -0500, bfields wrote: > From: "J. Bruce Fields" <bfields@xxxxxxxxxx> > > On a distributed filesystem it's possible for lookup to discover that a > directory it just found is already cached elsewhere in the directory > heirarchy. The dcache won't let us keep the directory in both places, > so we have to move the dentry to the new location from the place we > previously had it cached. > > If the parent has changed, then this requires all the same locks as we'd > need to do a cross-directory rename. But we're already in lookup > holding one parent's i_mutex, so it's too late to acquire those locks in > the right order. > > The (unreliable) solution in __d_unalias is to trylock() the required > locks and return -EBUSY if it fails. > > I see no particular reason for returning -EBUSY, and -ESTALE is already > the result of some other lookup races on NFS. I think -ESTALE is the > more helpful error return. It also allows us to take advantage of the > logic Jeff Layton added in c6a9428401c0 "vfs: fix renameat to retry on > ESTALE errors" and ancestors, which hopefully resolves some of these > errors before they're returned to userspace. > > I can reproduce these cases using NFS with: > > ssh root@$client ' > mount -olookupcache=pos '$server':'$export' /mnt/ > mkdir /mnt/TO > mkdir /mnt/DIR > touch /mnt/DIR/test.txt > while true; do > strace -e open cat /mnt/DIR/test.txt 2>&1 | grep EBUSY > done > ' > ssh root@$server ' > while true; do > mv $export/DIR $export/TO/DIR > mv $export/TO/DIR $export/DIR > done > ' > > It also helps to add some other concurrent use of the directory on the > client (e.g., "ls /mnt/TO"). And you can replace the server-side mv's > by client-side mv's that are repeatedly killed. (If the client is > interrupted while waiting for the RENAME response then it's left with a > dentry that has to go under one parent or the other, but it doesn't yet > know which.) > > Signed-off-by: J. Bruce Fields <bfields@xxxxxxxxxx> > --- > fs/dcache.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/fs/dcache.c b/fs/dcache.c > index 5bc72b07fde2..b7de7f1ae38f 100644 > --- a/fs/dcache.c > +++ b/fs/dcache.c > @@ -2612,7 +2612,7 @@ static struct dentry *__d_unalias(struct inode *inode, > struct dentry *dentry, struct dentry *alias) > { > struct mutex *m1 = NULL, *m2 = NULL; > - struct dentry *ret = ERR_PTR(-EBUSY); > + struct dentry *ret = ERR_PTR(-ESTALE); > > /* If alias and dentry share a parent, then no extra locks required */ > if (alias->d_parent == dentry->d_parent) > -- > 1.9.3 > -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html