Re: [PATCH 2/4] ext4: fix race chown vs truncate

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Dmitry Monakhov <dmonakhov@xxxxxxxxxx> writes:
> To prevent ext4_reservation vs dquot_reservation inconsistency, we have
> to reorganize locking ordering like follows:
> i_block_reservation_lock > dqptr_sem
> This means what all functions which changes ext4 or quota reservation have
> to hold i_block_reservation_lock.
While investigating this issue i've considered other solution

* Introduce i_block analog for generic reserved space management:
We may introduce i_rsv_block field in generic inode, and protected
it by i_lock(similar to i_block).
Introduce inc/dec/set/get methods similar to inode_get_bytes,
inode_sub_bytes.. . 
This value is managed internally by quota code. Perform reservation
management inside dquot_reserve_space, dquot_release_reservation
without interfering with fs internals, as we do for i_blocks.
IMHO this is best way because:

1)We don't have to hold i_block_reservation_lock while quota-op
  which may lead to virtual performance penalty.
2)This brings to well defined VFS interface for reserved space management.

But I expect some problems from AlViro because only ext4 would use it by now.
And off course this may lead to ext4_rsv quot_rsv inconsistency
due to some bugs.
>
> Signed-off-by: Dmitry Monakhov <dmonakhov@xxxxxxxxxx>
> ---
>  fs/ext4/inode.c |   23 ++++++++++++++++++-----
>  1 files changed, 18 insertions(+), 5 deletions(-)
>
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 84863e6..c521c93 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -1047,16 +1047,23 @@ cleanup:
>  out:
>  	return err;
>  }
> +/*
> + * Quota_transfer callback.
> + * During quota transfer we have to transfer rsv-blocks from one dquot to
> + * another. inode must be protected from concurrent reservation/reclamation.
> + * Locking ordering for all space reservation code:
> + *  i_block_reservation_lock > dqptr_sem
> + * This is differ from i_block,i_lock locking ordering, but this is the
> + * only possible way.
> + * NOTE: Caller must hold i_block_reservation_lock.
> + */
>  
>  qsize_t ext4_get_reserved_space(struct inode *inode)
>  {
>  	unsigned long long total;
>  
> -	spin_lock(&EXT4_I(inode)->i_block_reservation_lock);
>  	total = EXT4_I(inode)->i_reserved_data_blocks +
>  		EXT4_I(inode)->i_reserved_meta_blocks;
> -	spin_unlock(&EXT4_I(inode)->i_block_reservation_lock);
> -
>  	return (total << inode->i_blkbits);
>  }
>  /*
> @@ -1131,6 +1138,8 @@ static void ext4_da_update_reserve_space(struct inode *inode, int used)
>  	if (mdb_free)
>  		vfs_dq_release_reservation_block(inode, mdb_free);
>  
> +	spin_unlock(&EXT4_I(inode)->i_block_reservation_lock);
> +
>  	/*
>  	 * If we have done all the pending block allocations and if
>  	 * there aren't any writers on the inode, we can discard the
> @@ -1866,8 +1875,8 @@ repeat:
>  	}
>  
>  	if (ext4_claim_free_blocks(sbi, total)) {
> -		spin_unlock(&EXT4_I(inode)->i_block_reservation_lock);
>  		vfs_dq_release_reservation_block(inode, total);
> +		spin_unlock(&EXT4_I(inode)->i_block_reservation_lock);
>  		if (ext4_should_retry_alloc(inode->i_sb, &retries)) {
>  			yield();
>  			goto repeat;
> @@ -1924,9 +1933,9 @@ static void ext4_da_release_space(struct inode *inode, int to_free)
>  
>  	BUG_ON(mdb > EXT4_I(inode)->i_reserved_meta_blocks);
>  	EXT4_I(inode)->i_reserved_meta_blocks = mdb;
> -	spin_unlock(&EXT4_I(inode)->i_block_reservation_lock);
>  
>  	vfs_dq_release_reservation_block(inode, release);
> +	spin_unlock(&EXT4_I(inode)->i_block_reservation_lock);
>  }
>  
>  static void ext4_da_page_release_reservation(struct page *page,
> @@ -5436,7 +5445,11 @@ int ext4_setattr(struct dentry *dentry, struct iattr *attr)
>  			error = PTR_ERR(handle);
>  			goto err_out;
>  		}
> +		/* i_block_reservation must being held in order to avoid races
> +		 * with concurent block reservation. */
> +		spin_lock(&EXT4_I(inode)->i_block_reservation_lock);
>  		error = vfs_dq_transfer(inode, attr) ? -EDQUOT : 0;
> +		spin_unlock(&EXT4_I(inode)->i_block_reservation_lock);
>  		if (error) {
>  			ext4_journal_stop(handle);
>  			return error;
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Reiser Filesystem Development]     [Ceph FS]     [Kernel Newbies]     [Security]     [Netfilter]     [Bugtraq]     [Linux FS]     [Yosemite National Park]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Device Mapper]     [Linux Media]

  Powered by Linux