Re: [PATCH] dcache: return -ESTALE not -EBUSY on distributed fs race

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



I keep looking back at this problem trying to think of a better idea.
Most recently I remember Jeff's work to retry ESTALE's and wondered if
we could take advantage of that.

It's still kind of a bandaid, but it's the only thing I've thought of
that at least helps a little and isn't a huge pain.  Any other ideas?

--b.

On Wed, Dec 17, 2014 at 02:59:11PM -0500, bfields wrote:
> From: "J. Bruce Fields" <bfields@xxxxxxxxxx>
> 
> On a distributed filesystem it's possible for lookup to discover that a
> directory it just found is already cached elsewhere in the directory
> heirarchy.  The dcache won't let us keep the directory in both places,
> so we have to move the dentry to the new location from the place we
> previously had it cached.
> 
> If the parent has changed, then this requires all the same locks as we'd
> need to do a cross-directory rename.  But we're already in lookup
> holding one parent's i_mutex, so it's too late to acquire those locks in
> the right order.
> 
> The (unreliable) solution in __d_unalias is to trylock() the required
> locks and return -EBUSY if it fails.
> 
> I see no particular reason for returning -EBUSY, and -ESTALE is already
> the result of some other lookup races on NFS.  I think -ESTALE is the
> more helpful error return.  It also allows us to take advantage of the
> logic Jeff Layton added in c6a9428401c0 "vfs: fix renameat to retry on
> ESTALE errors" and ancestors, which hopefully resolves some of these
> errors before they're returned to userspace.
> 
> I can reproduce these cases using NFS with:
> 
> 	ssh root@$client '
> 		mount -olookupcache=pos '$server':'$export' /mnt/
> 		mkdir /mnt/TO
> 		mkdir /mnt/DIR
> 		touch /mnt/DIR/test.txt
> 		while true; do
> 			strace -e open cat /mnt/DIR/test.txt 2>&1 | grep EBUSY
> 		done
> 	'
> 	ssh root@$server '
> 		while true; do
> 			mv $export/DIR $export/TO/DIR
> 			mv $export/TO/DIR $export/DIR
> 		done
> 	'
> 
> It also helps to add some other concurrent use of the directory on the
> client (e.g., "ls /mnt/TO").  And you can replace the server-side mv's
> by client-side mv's that are repeatedly killed.  (If the client is
> interrupted while waiting for the RENAME response then it's left with a
> dentry that has to go under one parent or the other, but it doesn't yet
> know which.)
> 
> Signed-off-by: J. Bruce Fields <bfields@xxxxxxxxxx>
> ---
>  fs/dcache.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/dcache.c b/fs/dcache.c
> index 5bc72b07fde2..b7de7f1ae38f 100644
> --- a/fs/dcache.c
> +++ b/fs/dcache.c
> @@ -2612,7 +2612,7 @@ static struct dentry *__d_unalias(struct inode *inode,
>  		struct dentry *dentry, struct dentry *alias)
>  {
>  	struct mutex *m1 = NULL, *m2 = NULL;
> -	struct dentry *ret = ERR_PTR(-EBUSY);
> +	struct dentry *ret = ERR_PTR(-ESTALE);
>  
>  	/* If alias and dentry share a parent, then no extra locks required */
>  	if (alias->d_parent == dentry->d_parent)
> -- 
> 1.9.3
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux