Re: [PATCH 4/4] ext4: avoid dividing by 0 in mb_update_avg_fragment_size() when block bitmap corrupt

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2023/12/19 16:02, Baokun Li wrote:
On 2023/12/18 23:09, Jan Kara wrote:
On Mon 18-12-23 15:43:42, Jan Kara wrote:
On Mon 18-12-23 22:18:14, Baokun Li wrote:
When bb_free is not 0 but bb_fragments is 0, return directly to avoid
system crash due to division by zero.
How could this possibly happen? bb_fragments is the number of free space extents and bb_free is the number of free blocks. No free space extents =>
no free blocks seems pretty obvious? You can see the logic in
ext4_mb_generate_buddy()...
Oh, I see. This is probably about "bitmap corrupted case". But still both
allocation and freeing of blocks shouldn't operate on bitmaps marked as
corrupted so this should not happen?

                                Honza
Yes, we should make sure that we don't allocate or free blocks in
groups where the block bitmap has been marked as corrupt, but
there are still some issues here:

1. When a block bitmap is found to be corrupted, ext4_grp_locked_error()
is always called first, and only after that the block bitmap of the group
is marked as corrupted. In ext4_grp_locked_error(), the group may
be unlocked, and then other processes may be able to access the
corrupted bitmap. In this case, we can just put the marking of
corruption before ext4_grp_locked_error().

2. ext4_free_blocks() finds a corrupt bitmap can just return and do
nothing, because there is no problem with not freeing an exception
block. But mb_mark_used() has no logic for determining if a block
bitmap is corrupt, and its caller has no error handling logic, so
mb_mark_used() needs its caller to make sure that it doesn't allocate
blocks in a group with a corrupted block bitmap (which is why it
added the judgment in patch 2). However, it is possible to unlock group
between determining whether the group is corrupt and actually calling
mb_mark_used() to use those blocks. For example, when calling
mb_mark_used() in ext4_mb_try_best_found(), we are determining
whether the group's block bitmap is corrupted or not in the previous
ext4_mb_good_group(), but we are not determining it again when using
the blocks in ext4_mb_try_best_found(), at which point we may be
modifying the corrupted block bitmap.

3. Determine if a block bitmap is corrupted outside of a group lock
in ext4_mb_find_by_goal().

4. In ext4_mb_check_limits(), it may be possible to use the ac_b_ex
found in group 0 while holding a lock in group 1.

I'm very sorry that the fourth point was wrong, I read "||" as "&&" in
ext4_mb_check_limits() :

 if (finish_group || ac->ac_found > sbi->s_mb_min_to_scan)


In addition to the above, there may be some corner cases that cause
inconsistencies, so here we determine if bb_fragments is 0 to avoid a
crash due to division by zero. Perhaps we could just replace
grp->bb_free == 0 with grp->bb_fragments == 0, which wouldn't look
so strange.

Thanks!
--
With Best Regards,
Baokun Li
.




[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux