Re: [PATCH] vfs: take i_mutex on renamed file

"J. Bruce Fields" <bfields@xxxxxxxxxxxx> · Mon, 5 Mar 2012 17:43:34 -0500

Al, do you see any reason this won't work?

After this, all I need to make nfsd happy is a patch to locks.c to
implement delegations, and then a few one-liners to add break_deleg
calls in the right operations.  So this is the one non-trivial change I
have to core vfs code.

All of those patches are also ready, and I'd like to merge them for 3.4.

--b.

On Mon, Mar 05, 2012 at 05:38:47PM -0500, bfields wrote:
> From: "J. Bruce Fields" <bfields@xxxxxxxxxx>
> 
> A read delegation is used by NFSv4 as a guarantee that a client can
> perform local read opens without informing the server.
> 
> The open operation takes the last component of the pathname as an
> argument, thus is also a lookup operation, and giving the client the
> above guarantee means informing the client before we allow anything that
> would change the set of names pointing to the inode.
> 
> Therefore, we need to break delegations on rename, link, and unlink.
> 
> We also need to prevent new delegations from being acquired while one of
> these operations is in progress.
> 
> We could add some completely new locking for that purpose, but it's
> simpler to use the i_mutex, since that's already taken by all the
> operations we care about.
> 
> The single exception is rename.  So, modify rename to take the i_mutex
> on the file that is being renamed.
> 
> Also fix up lockdep and Documentation/filesystems/directory-locking to
> reflect the change.
> 
> Signed-off-by: J. Bruce Fields <bfields@xxxxxxxxxx>
> ---
>  Documentation/filesystems/directory-locking |   11 ++++++-----
>  fs/namei.c                                  |    3 +++
>  include/linux/fs.h                          |    9 ++++++---
>  3 files changed, 15 insertions(+), 8 deletions(-)
> 
> diff --git a/Documentation/filesystems/directory-locking b/Documentation/filesystems/directory-locking
> index ff7b611..9edbcd2 100644
> --- a/Documentation/filesystems/directory-locking
> +++ b/Documentation/filesystems/directory-locking
> @@ -12,8 +12,8 @@ kinds of locks - per-inode (->i_mutex) and per-filesystem
>  locks victim and calls the method.
>  
>  4) rename() that is _not_ cross-directory.  Locking rules: caller locks
> -the parent, finds source and target, if target already exists - locks it
> -and then calls the method.
> +the parent, finds source and target, locks source, also locks target if
> +it already exists, and then calls the method.
>  
>  5) link creation.  Locking rules:
>  	* lock parent
> @@ -30,6 +30,7 @@ rules:
>  		fail with -ENOTEMPTY
>  	* if new parent is equal to or is a descendent of source
>  		fail with -ELOOP
> +	* lock source if it is not a directory.
>  	* if target exists - lock it.
>  	* call the method.
>  
> @@ -56,9 +57,9 @@ objects - A < B iff A is an ancestor of B.
>      renames will be blocked on filesystem lock and we don't start changing
>      the order until we had acquired all locks).
>  
> -(3) any operation holds at most one lock on non-directory object and
> -    that lock is acquired after all other locks.  (Proof: see descriptions
> -    of operations).
> +(3) locks on non-directory objects are acquired only after taking locks
> +    on their parents (which remain their parents until all locks are
> +    acquired, by (1) and (2)).  (Proof: see descriptions of operations).
>  
>  	Now consider the minimal deadlock.  Each process is blocked on
>  attempt to acquire some lock and already holds at least one lock.  Let's
> diff --git a/fs/namei.c b/fs/namei.c
> index 208c6aa..d29b3c4 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -3073,6 +3073,7 @@ static int vfs_rename_other(struct inode *old_dir, struct dentry *old_dentry,
>  			    struct inode *new_dir, struct dentry *new_dentry)
>  {
>  	struct inode *target = new_dentry->d_inode;
> +	struct inode *source = old_dentry->d_inode;
>  	int error;
>  
>  	error = security_inode_rename(old_dir, old_dentry, new_dir, new_dentry);
> @@ -3080,6 +3081,7 @@ static int vfs_rename_other(struct inode *old_dir, struct dentry *old_dentry,
>  		return error;
>  
>  	dget(new_dentry);
> +	mutex_lock_nested(&source->i_mutex, I_MUTEX_RENAME_SOURCE);
>  	if (target)
>  		mutex_lock(&target->i_mutex);
>  
> @@ -3098,6 +3100,7 @@ static int vfs_rename_other(struct inode *old_dir, struct dentry *old_dentry,
>  out:
>  	if (target)
>  		mutex_unlock(&target->i_mutex);
> +	mutex_unlock(&source->i_mutex);
>  	dput(new_dentry);
>  	return error;
>  }
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 386da09..e537b67 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -855,10 +855,12 @@ static inline int inode_unhashed(struct inode *inode)
>   * 0: the object of the current VFS operation
>   * 1: parent
>   * 2: child/target
> - * 3: quota file
> + * 3: xattr
> + * 4: quota file
> + * 5: the file being renamed (used only in rename of a non-directory)
>   *
>   * The locking order between these classes is
> - * parent -> child -> normal -> xattr -> quota
> + * parent -> child -> rename_source -> normal -> xattr -> quota
>   */
>  enum inode_i_mutex_lock_class
>  {
> @@ -866,7 +868,8 @@ enum inode_i_mutex_lock_class
>  	I_MUTEX_PARENT,
>  	I_MUTEX_CHILD,
>  	I_MUTEX_XATTR,
> -	I_MUTEX_QUOTA
> +	I_MUTEX_QUOTA,
> +	I_MUTEX_RENAME_SOURCE
>  };
>  
>  /*
> -- 
> 1.7.5.4
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html