Re: [PATCH] ext4: brelse all indirect buffers in ext4_ind_remove_space()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2019/3/14 17:54, Jan Kara Wrote:
> On Thu 14-03-19 10:19:35, zhangyi (F) wrote:
>> On 2019/3/14 1:34, Jan Kara Wrote:
>>> On Sat 09-03-19 21:19:07, zhangyi (F) wrote:
>>>> All indirect buffers get by ext4_find_shared() should be released no
>>>> mater the branch should be freed or not. But now, we forget to release
>>>> the lower depth indirect buffers when removing space from the same
>>>> higher depth indirect block. It will lead to buffer leak and futher
>>>> more, it may lead to quota information corruption when using old quota,
>>>> consider the following case.
>>>>
>>>>  - Create and mount an empty ext4 filesystem without extent and quota
>>>>    features,
>>>>  - quotacheck and enable the user & group quota,
>>>>  - Create some files and write some data to them, and then punch hole
>>>>    to some files of them, it may trigger the buffer leak problem
>>>>    mentioned above.
>>>>  - Disable quota and run quotacheck again, it will create two new
>>>>    aquota files and write the checked quota information to them, which
>>>>    probably may reuse the freed indirect block(the buffer and page
>>>>    cache was not freed) as data block.
>>>>  - Enable quota again, it will invoke
>>>>    vfs_load_quota_inode()->invalidate_bdev() to try to clean unused
>>>>    buffers and pagecache. Unfortunately, because of the buffer of quota
>>>>    data block is still referenced, quota code cannot read the up to date
>>>>    quota info from the device and lead to quota information corruption.
>>>>
>>>> This problem can be reproduced by xfstests generic/231 on ext3 file
>>>> system or ext4 filesystem without extent and quota feature.
>>>>
>>>> This patch fix this problem by brelse all indirect buffers, and also
>>>> do some cleanup of the brelse code in ext4_ind_remove_space().
>>>>
>>>> Reported-by: Hulk Robot <hulkci@xxxxxxxxxx>
>>>> Signed-off-by: zhangyi (F) <yi.zhang@xxxxxxxxxx>
>>>
>>> Thanks for the report and the patch! I think it is correct but I'd find it
>>> simpler (not only the patch as such but also the resulting code in the
>>> function) to just fix that one place that seems to leak bh references.
>>> Something like:
>>>
>>>                 if (partial > chain && partial2 > chain2 &&
>>>                     partial->bh->b_blocknr == partial2->bh->b_blocknr) {
>>>                         /*
>>>                          * We've converged on the same block. Clear the range,
>>>                          * then we're done.
>>>                          */
>>>                         ext4_free_branches(handle, inode, partial->bh,
>>>                                            partial->p + 1,
>>>                                            partial2->p,
>>>                                            (chain+n-1) - partial);
>>> -                       BUFFER_TRACE(partial->bh, "call brelse");
>>> -                       brelse(partial->bh);
>>> -                       BUFFER_TRACE(partial2->bh, "call brelse");
>>> -                       brelse(partial2->bh);
>>> +			while (partial > chain) {
>>> +				BUFFER_TRACE(partial->bh, "call brelse");
>>> +				brelse(partial->bh);
>>> +			}
>>> +			while (partial2 > chain2) {
>>> +				BUFFER_TRACE(partial2->bh, "call brelse");
>>> +				brelse(partial2->bh);
>>> +			}
>>>                         return 0;
>>>                 }
>>>
>>> Or is there some other problem I miss?
>>>
>>
>> Hi Jan, thanks for your suggestion. There are no other problems, and the
>> code you suggest can fix this bh references leak correctly.
>>
>> But now we put the release code everywhere in ext4_ind_remove_space(), I
>> think it's fragile and hard to read & maintain, so I not only want to fix
>> this problem but also do some cleanup in this patch(maybe split to a single
>> patch will be better). Suggestions?
> 
> Yes, I understood that you also wanted to cleanup stuff. Firstly, it would
> be good to separate the real fix from the cleanup as you mention (to make
> backports easier although in this case it's not a huge deal). Second, I'm
> not convinced that the new code is that easier to verify - previously we've
> released bh whenever we were done with it (so that we don't leave any
> unreleased bh's beyond 'partial'), with your cleanup we postpone all the
> releasing to the end. Both options are fine with me and I don't find either
> strictly better than the other. But you can submit that cleanup as a separate
> patch and Ted can decide whether he likes it or not.
> 

Thanks for your suggestion, I will split this patch and post v2.

Thanks,
Yi.

>>
>>>> ---
>>>>  fs/ext4/indirect.c | 43 ++++++++++++++++++++++---------------------
>>>>  1 file changed, 22 insertions(+), 21 deletions(-)
>>>>
>>>> diff --git a/fs/ext4/indirect.c b/fs/ext4/indirect.c
>>>> index bf7fa15..6f3f7d5 100644
>>>> --- a/fs/ext4/indirect.c
>>>> +++ b/fs/ext4/indirect.c
>>>> @@ -1219,6 +1219,7 @@ int ext4_ind_remove_space(handle_t *handle, struct inode *inode,
>>>>  	ext4_lblk_t offsets[4], offsets2[4];
>>>>  	Indirect chain[4], chain2[4];
>>>>  	Indirect *partial, *partial2;
>>>> +	Indirect *p = NULL, *p2 = NULL;
>>>>  	ext4_lblk_t max_block;
>>>>  	__le32 nr = 0, nr2 = 0;
>>>>  	int n = 0, n2 = 0;
>>>> @@ -1260,7 +1261,7 @@ int ext4_ind_remove_space(handle_t *handle, struct inode *inode,
>>>>  		}
>>>>  
>>>>  
>>>> -		partial = ext4_find_shared(inode, n, offsets, chain, &nr);
>>>> +		partial = p = ext4_find_shared(inode, n, offsets, chain, &nr);
>>>>  		if (nr) {
>>>>  			if (partial == chain) {
>>>>  				/* Shared branch grows from the inode */
>>>> @@ -1285,13 +1286,11 @@ int ext4_ind_remove_space(handle_t *handle, struct inode *inode,
>>>>  				partial->p + 1,
>>>>  				(__le32 *)partial->bh->b_data+addr_per_block,
>>>>  				(chain+n-1) - partial);
>>>> -			BUFFER_TRACE(partial->bh, "call brelse");
>>>> -			brelse(partial->bh);
>>>>  			partial--;
>>>>  		}
>>>>  
>>>>  end_range:
>>>> -		partial2 = ext4_find_shared(inode, n2, offsets2, chain2, &nr2);
>>>> +		partial2 = p2 = ext4_find_shared(inode, n2, offsets2, chain2, &nr2);
>>>>  		if (nr2) {
>>>>  			if (partial2 == chain2) {
>>>>  				/*
>>>> @@ -1321,16 +1320,14 @@ int ext4_ind_remove_space(handle_t *handle, struct inode *inode,
>>>>  					   (__le32 *)partial2->bh->b_data,
>>>>  					   partial2->p,
>>>>  					   (chain2+n2-1) - partial2);
>>>> -			BUFFER_TRACE(partial2->bh, "call brelse");
>>>> -			brelse(partial2->bh);
>>>>  			partial2--;
>>>>  		}
>>>>  		goto do_indirects;
>>>>  	}
>>>>  
>>>>  	/* Punch happened within the same level (n == n2) */
>>>> -	partial = ext4_find_shared(inode, n, offsets, chain, &nr);
>>>> -	partial2 = ext4_find_shared(inode, n2, offsets2, chain2, &nr2);
>>>> +	partial = p = ext4_find_shared(inode, n, offsets, chain, &nr);
>>>> +	partial2 = p2 = ext4_find_shared(inode, n2, offsets2, chain2, &nr2);
>>>>  
>>>>  	/* Free top, but only if partial2 isn't its subtree. */
>>>>  	if (nr) {
>>>> @@ -1387,11 +1384,7 @@ int ext4_ind_remove_space(handle_t *handle, struct inode *inode,
>>>>  					   partial->p + 1,
>>>>  					   partial2->p,
>>>>  					   (chain+n-1) - partial);
>>>> -			BUFFER_TRACE(partial->bh, "call brelse");
>>>> -			brelse(partial->bh);
>>>> -			BUFFER_TRACE(partial2->bh, "call brelse");
>>>> -			brelse(partial2->bh);
>>>> -			return 0;
>>>> +			goto clean_up;
>>>>  		}
>>>>  
>>>>  		/*
>>>> @@ -1406,8 +1399,6 @@ int ext4_ind_remove_space(handle_t *handle, struct inode *inode,
>>>>  					   partial->p + 1,
>>>>  					   (__le32 *)partial->bh->b_data+addr_per_block,
>>>>  					   (chain+n-1) - partial);
>>>> -			BUFFER_TRACE(partial->bh, "call brelse");
>>>> -			brelse(partial->bh);
>>>>  			partial--;
>>>>  		}
>>>>  		if (partial2 > chain2 && depth2 <= depth) {
>>>> @@ -1415,11 +1406,21 @@ int ext4_ind_remove_space(handle_t *handle, struct inode *inode,
>>>>  					   (__le32 *)partial2->bh->b_data,
>>>>  					   partial2->p,
>>>>  					   (chain2+n2-1) - partial2);
>>>> -			BUFFER_TRACE(partial2->bh, "call brelse");
>>>> -			brelse(partial2->bh);
>>>>  			partial2--;
>>>>  		}
>>>>  	}
>>>> +
>>>> +clean_up:
>>>> +	while (p && p > chain) {
>>>> +		BUFFER_TRACE(p->bh, "call brelse");
>>>> +		brelse(p->bh);
>>>> +		p--;
>>>> +	}
>>>> +	while (p2 && p2 > chain2) {
>>>> +		BUFFER_TRACE(p2->bh, "call brelse");
>>>> +		brelse(p2->bh);
>>>> +		p2--;
>>>> +	}
>>>>  	return 0;
>>>>  
>>>>  do_indirects:
>>>> @@ -1427,7 +1428,7 @@ int ext4_ind_remove_space(handle_t *handle, struct inode *inode,
>>>>  	switch (offsets[0]) {
>>>>  	default:
>>>>  		if (++n >= n2)
>>>> -			return 0;
>>>> +			break;
>>>>  		nr = i_data[EXT4_IND_BLOCK];
>>>>  		if (nr) {
>>>>  			ext4_free_branches(handle, inode, NULL, &nr, &nr+1, 1);
>>>> @@ -1435,7 +1436,7 @@ int ext4_ind_remove_space(handle_t *handle, struct inode *inode,
>>>>  		}
>>>>  	case EXT4_IND_BLOCK:
>>>>  		if (++n >= n2)
>>>> -			return 0;
>>>> +			break;
>>>>  		nr = i_data[EXT4_DIND_BLOCK];
>>>>  		if (nr) {
>>>>  			ext4_free_branches(handle, inode, NULL, &nr, &nr+1, 2);
>>>> @@ -1443,7 +1444,7 @@ int ext4_ind_remove_space(handle_t *handle, struct inode *inode,
>>>>  		}
>>>>  	case EXT4_DIND_BLOCK:
>>>>  		if (++n >= n2)
>>>> -			return 0;
>>>> +			break;
>>>>  		nr = i_data[EXT4_TIND_BLOCK];
>>>>  		if (nr) {
>>>>  			ext4_free_branches(handle, inode, NULL, &nr, &nr+1, 3);
>>>> @@ -1452,5 +1453,5 @@ int ext4_ind_remove_space(handle_t *handle, struct inode *inode,
>>>>  	case EXT4_TIND_BLOCK:
>>>>  		;
>>>>  	}
>>>> -	return 0;
>>>> +	goto clean_up;
>>>>  }
>>>> -- 
>>>> 2.7.4
>>>>
>>




[Index of Archives]     [Reiser Filesystem Development]     [Ceph FS]     [Kernel Newbies]     [Security]     [Netfilter]     [Bugtraq]     [Linux FS]     [Yosemite National Park]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Device Mapper]     [Linux Media]

  Powered by Linux