Re: [PATCH v3 03/26] ext4: correct the hole length returned by ext4_map_blocks()

Zhang Yi <yi.zhang@xxxxxxxxxxxxxxx> · Fri, 10 May 2024 19:40:52 +0800

On 2024/5/10 17:41, Luis Henriques wrote:
> On Fri 10 May 2024 11:39:48 AM +08, Zhang Yi wrote;
> 
>> On 2024/5/10 1:23, Luis Henriques wrote:
>>> On Thu 09 May 2024 12:39:53 PM -04, Theodore Ts'o wrote;
>>>
>>>> On Thu, May 09, 2024 at 04:16:34PM +0100, Luis Henriques wrote:
>>>>>
>>>>> It's looks like it's easy to trigger an infinite loop here using fstest
>>>>> generic/039.  If I understand it correctly (which doesn't happen as often
>>>>> as I'd like), this is due to an integer overflow in the 'if' condition,
>>>>> and should be fixed with the patch below.
>>>>
>>>> Thanks for the report.  However, I can't reproduce the failure, and
>>>> looking at generic/039, I don't see how it could be relevant to the
>>>> code path in question.  Generic/039 creates a test symlink with two
>>>> hard links in the same directory, syncs the file system, and then
>>>> removes one of the hard links, and then drops access to the block
>>>> device using dmflakey.  So I don't see how the extent code would be
>>>> involved at all.  Are you sure that you have the correct test listed?
>>>
>>> Yep, I just retested and it's definitely generic/039.  I'm using a simple
>>> test environment, with virtme-ng.
>>>
>>>> Looking at the code in question in fs/ext4/extents.c:
>>>>
>>>> again:
>>>> 	ext4_es_find_extent_range(inode, &ext4_es_is_delayed, hole_start,
>>>> 				  hole_start + len - 1, &es);
>>>> 	if (!es.es_len)
>>>> 		goto insert_hole;
>>>>
>>>>   	 * There's a delalloc extent in the hole, handle it if the delalloc
>>>>   	 * extent is in front of, behind and straddle the queried range.
>>>>   	 */
>>>>  -	if (lblk >= es.es_lblk + es.es_len) {
>>>>  +	if (lblk >= ((__u64) es.es_lblk) + es.es_len) {
>>>>   		/*
>>>>   		 * The delalloc extent is in front of the queried range,
>>>>   		 * find again from the queried start block.
>>>> 		len -= lblk - hole_start;
>>>> 		hole_start = lblk;
>>>> 		goto again;
>>>>
>>>> lblk and es.es_lblk are both __u32.  So the infinite loop is
>>>> presumably because es.es_lblk + es.es_len has overflowed.  This should
>>>> never happen(tm), and in fact we have a test for this case which
>>>
>>> If I instrument the code, I can see that es.es_len is definitely set to
>>> EXT_MAX_BLOCKS, which will overflow.
>>>
>>
>> Thanks for the report. After looking at the code, I think the root
>> cause of this issue is the variable es was not initialized on replaying
>> fast commit. ext4_es_find_extent_range() will return directly when
>> EXT4_FC_REPLAY flag is set, and then the es.len becomes stall.
>>
>> I can always reproduce this issue on generic/039 with
>> MKFS_OPTIONS="-O fast_commit".
>>
>> This uninitialization problem originally existed in the old
>> ext4_ext_put_gap_in_cache(), but it didn't trigger any real problem
>> since we never check and use extent cache when replaying fast commit.
>> So I suppose the correct fix would be to unconditionally initialize
>> the es variable.
> 
> Oh, you're absolutely right -- the extent_status 'es' struct isn't being
> initialized in that case.  I totally failed to see that.  And yes, I also
> failed to mention I had 'fast_commit' feature enabled, sorry!
> 
> Thanks a lot for figuring this out, Yi.  I'm looking at this code and
> trying to understand if it would be safe to call __es_find_extent_range()
> when EXT4_FC_REPLAY is in progress.  Probably not, and probably better to
> simply do:
> 
> 	es->es_lblk = es->es_len = es->es_pblk = 0;
> 
> in that case.  I'll send out a patch later today.
> 

Yeah, I'm glad it could help.

Thanks,
Yi.