On 2024/5/15 17:13, Luis Henriques wrote: > On Wed 15 May 2024 04:52:54 PM +08, Zhang Yi wrote; > >> On 2024/5/15 16:28, Luis Henriques wrote: >>> On Wed 15 May 2024 12:59:26 PM +08, Zhang Yi wrote; >>> >>>> On 2024/5/14 21:04, Luis Henriques wrote: >>>>> On Sat 11 May 2024 02:24:17 PM +08, Zhang Yi wrote; >>>>> >>>>>> On 2024/5/10 19:52, Luis Henriques (SUSE) wrote: >>>>>>> When doing fast_commit replay an infinite loop may occur due to an >>>>>>> uninitialized extent_status struct. ext4_ext_determine_insert_hole() does >>>>>>> not detect the replay and calls ext4_es_find_extent_range(), which will >>>>>>> return immediately without initializing the 'es' variable. >>>>>>> >>>>>>> Because 'es' contains garbage, an integer overflow may happen causing an >>>>>>> infinite loop in this function, easily reproducible using fstest generic/039. >>>>>>> >>>>>>> This commit fixes this issue by detecting the replay in function >>>>>>> ext4_ext_determine_insert_hole(). It also adds initialization code to the >>>>>>> error path in function ext4_es_find_extent_range(). >>>>>>> >>>>>>> Thanks to Zhang Yi, for figuring out the real problem! >>>>>>> >>>>>>> Fixes: 8016e29f4362 ("ext4: fast commit recovery path") >>>>>>> Signed-off-by: Luis Henriques (SUSE) <luis.henriques@xxxxxxxxx> >>>>>>> --- >>>>>>> Hi! >>>>>>> >>>>>>> Two comments: >>>>>>> 1) The change in ext4_ext_map_blocks() could probably use the min_not_zero >>>>>>> macro instead. I decided not to do so simply because I wasn't sure if >>>>>>> that would be safe, but I'm fine changing that if you think it is. >>>>>>> >>>>>>> 2) I thought about returning 'EXT_MAX_BLOCKS' instead of '0' in >>>>>>> ext4_lblk_t ext4_ext_determine_insert_hole(), which would then avoid >>>>>>> the extra change to ext4_ext_map_blocks(). '0' sounds like the right >>>>>>> value to return, but I'm also OK using 'EXT_MAX_BLOCKS' instead. >>>>>>> >>>>>>> And again thanks to Zhang Yi for pointing me the *real* problem! >>>>>>> >>>>>>> fs/ext4/extents.c | 6 +++++- >>>>>>> fs/ext4/extents_status.c | 5 ++++- >>>>>>> 2 files changed, 9 insertions(+), 2 deletions(-) >>>>>>> >>>>>>> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c >>>>>>> index e57054bdc5fd..b5bfcb6c18a0 100644 >>>>>>> --- a/fs/ext4/extents.c >>>>>>> +++ b/fs/ext4/extents.c >>>>>>> @@ -4052,6 +4052,9 @@ static ext4_lblk_t ext4_ext_determine_insert_hole(struct inode *inode, >>>>>>> ext4_lblk_t hole_start, len; >>>>>>> struct extent_status es; >>>>>>> >>>>>>> + if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY) >>>>>>> + return 0; >>>>>>> + >>>>>> >>>>>> Sorry, I think it's may not correct. When replaying the jouranl, although >>>>>> we don't use the extent statue tree, we still need to query the accurate >>>>>> hole length, e.g. please see skip_hole(). If you do this, the hole length >>>>>> becomes incorrect, right? >>>>> >>>>> Thank you for your review (and sorry for my delay replying). >>>>> >>>>> So, I see three different options to follow your suggestion: >>>>> >>>>> 1) Initialize 'es' immediately when declaring it in function >>>>> ext4_ext_determine_insert_hole(): >>>>> >>>>> es.es_lblk = es.es_len = es.es_pblk = 0; >>>>> >>>>> 2) Initialize 'es' only in ext4_es_find_extent_range() when checking if an >>>>> fc replay is in progress (my patch was already doing something like >>>>> that): >>>>> >>>>> if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY) { >>>>> /* Initialize extent to zero */ >>>>> es->es_lblk = es->es_len = es->es_pblk = 0; >>>>> return; >>>>> } >>>>> >>>>> 3) Remove the check for fc replay in function ext4_es_find_extent_range(), >>>>> which will then unconditionally call __es_find_extent_range(). This >>>>> will effectively also initialize the 'es' fields to '0' and, because >>>>> __es_tree_search() will return NULL (at least in generic/039 test!), >>>>> nothing else will be done. >>>>> >>>>> Since all these 3 options seem to have the same result, I believe option >>>>> 1) is probably the best as it initializes the structure shortly after it's >>>>> declaration. Would you agree? Or did I misunderstood you? >>>>> >>>> >>>> Both 1 and 2 are looks fine to me, but I would prefer to initialize it >>>> unconditionally in ext4_es_find_extent_range(). >>>> >>>> @@ -310,6 +310,8 @@ void ext4_es_find_extent_range(struct inode *inode, >>>> ext4_lblk_t lblk, ext4_lblk_t end, >>>> struct extent_status *es) >>>> { >>>> + es->es_lblk = es->es_len = es->es_pblk = 0; >>>> + >>>> if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY) >>>> return; >>> >>> Thank you, Yi. I'll send out v2 shortly. Although, to be fair, the real >>> patch author shouldn't be me. :-) >>> >> >> Never mind, I just give a suggestion and also I didn't do a full test on >> this change. > > Oh, talking about testing, I forgot to mention that I see the same > behaviour with generic/311. I.e. this test also enters an infinite loop, > but fixed with this patch. > Yeah, generic/311 also does a lot of mount && journal recovery operations, and there maybe some other fault injection tests could have the same results, it's all right now. :) Thanks, Yi.