Re: Warning from unlock_new_inode

Hi Dave,

Eric mentioned a really excellent bugfix earlier.  This must be it.

On Wed, Feb 29, 2012 at 12:49:06PM +1100, Dave Chinner wrote:
> xfs: fix inode lookup race
> 
> From: Dave Chinner <dchinner@xxxxxxxxxx>
> 
> When we get concurrent lookups of the same inode that is not in the
> per-AG inode cache, there is a race condition that triggers warnings
> in unlock_new_inode() indicating that we are initialising an inode
> that isn't in the correct state for a new inode.
> 
> When we do an inode lookup via a file handle or a bulkstat, we don't
> serialise lookups at a higher level through the dentry cache (i.e.
> pathless lookup), and so we can get concurrent lookups of the same
> inode.
> 
> The race condition is between the insertion of the inode into the
> cache in the case of a cache miss and a concurrent lookup:
> 
> Thread 1			Thread 2
> xfs_iget()
>   xfs_iget_cache_miss()
>     xfs_iread()
>     lock radix tree
>     radix_tree_insert()
>     				rcu_read_lock
> 				radix_tree_lookup
> 				lock inode flags
> 				XFS_INEW not set
> 				igrab()
> 				unlock inode flags
> 				rcu_read_unlock
> 				use uninitialised inode
> 				.....
>     lock inode flags
>     set XFS_INEW
>     unlock inode flags
>     unlock radix tree
>   xfs_setup_inode()
>     inode flags = I_NEW
>     unlock_new_inode()
>       WARNING as inode flags != I_NEW
> 
> This can lead to inode corruption, inode list corruption, etc, and
> is generally a bad thing to occur.
> 
> Fix this by setting XFS_INEW before inserting the inode into the
> radix tree. This will ensure any concurrent lookup will find the new
> inode with XFS_INEW set and that forces the lookup to wait until the
> XFS_INEW flag is removed before allowing the lookup to succeed.
> 
> Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
> ---
>  fs/xfs/xfs_iget.c |   17 +++++++++++------
>  1 files changed, 11 insertions(+), 6 deletions(-)
> 
> diff --git a/fs/xfs/xfs_iget.c b/fs/xfs/xfs_iget.c
> index 05bed2b..2467ab7 100644
> --- a/fs/xfs/xfs_iget.c
> +++ b/fs/xfs/xfs_iget.c
> @@ -350,9 +350,19 @@ xfs_iget_cache_miss(
>  			BUG();
>  	}
>  
> -	spin_lock(&pag->pag_ici_lock);
> +	/* These values _must_ be set before inserting the inode into the radix
> +	 * tree as the moment it is inserted a concurrent lookup (allowed by the
> +	 * RCU locking mechanism) can find it and that lookup must see that this
> +	 * is an inode currently under construction (i.e. that XFS_INEW is set).
> +	 * The ip->i_flags_lock that protects the XFS_INEW flag forms the
> +	 * memory barrier that ensures this detection works correctly at lookup
> +	 * time.
> +	 */
> +	xfs_iflags_set(ip, XFS_INEW);
> +	ip->i_udquot = ip->i_gdquot = NULL;
>  
>  	/* insert the new inode */
> +	spin_lock(&pag->pag_ici_lock);
>  	error = radix_tree_insert(&pag->pag_ici_root, agino, ip);
>  	if (unlikely(error)) {
>  		WARN_ON(error != -EEXIST);
> @@ -360,11 +370,6 @@ xfs_iget_cache_miss(
>  		error = EAGAIN;
>  		goto out_preload_end;
>  	}
> -
> -	/* These values _must_ be set before releasing the radix tree lock! */
							   ^^^ 
So, in this comment 'radix tree lock' refers to pag->pag_ici_lock?

And pag_ici_lock provides no exclusion against radix_tree_lookup.
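
To check my understanding of the ordering, here is a minimal userspace
sketch, not the XFS code. Everything in it is a hypothetical stand-in:
DEMO_INEW for XFS_INEW, a one-slot atomic pointer for the per-AG radix
tree, and C11 acquire/release atomics for the i_flags_lock and RCU
ordering. The only point is that a lockless reader can see the object
the moment it is published, so the "under construction" flag has to be
set before the publish, not after it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define DEMO_INEW 0x1u	/* hypothetical stand-in for XFS_INEW */

struct demo_inode {
	atomic_uint flags;	/* stand-in for ip->i_flags */
	int initialised;	/* nonzero once setup has completed */
};

/* One-slot "cache" standing in for the per-AG radix tree. */
static _Atomic(struct demo_inode *) demo_slot;

/* Mark the inode as under construction. */
static void demo_mark_new(struct demo_inode *ip)
{
	atomic_fetch_or_explicit(&ip->flags, DEMO_INEW, memory_order_release);
}

/* Publish the inode; from here on a lockless lookup can find it. */
static void demo_insert(struct demo_inode *ip)
{
	atomic_store_explicit(&demo_slot, ip, memory_order_release);
}

/* Finish setup, then clear the flag so lookups may succeed. */
static void demo_finish(struct demo_inode *ip)
{
	assert(atomic_load(&ip->flags) & DEMO_INEW);
	ip->initialised = 1;
	atomic_fetch_and_explicit(&ip->flags, ~DEMO_INEW, memory_order_release);
}

/*
 * Lockless lookup: takes no lock shared with demo_insert(), so the
 * flag check is its only protection. Returns NULL if nothing is
 * cached or the inode is still under construction (caller retries).
 */
static struct demo_inode *demo_lookup(void)
{
	struct demo_inode *ip;

	ip = atomic_load_explicit(&demo_slot, memory_order_acquire);
	if (ip == NULL)
		return NULL;
	if (atomic_load_explicit(&ip->flags, memory_order_acquire) & DEMO_INEW)
		return NULL;
	return ip;
}
```

With the old ordering (demo_insert() first, demo_mark_new() later), a
demo_lookup() that runs in between returns an inode whose initialised
field is still zero. With the patched ordering (demo_mark_new() before
demo_insert()), the same lookup returns NULL until demo_finish() runs.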

I believe I understand.  That isn't to say that I couldn't use a
brush-up on RCU.  Awesome.  ;)

Reviewed-by: Ben Myers <bpm@xxxxxxx>

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs

