On Aug 7, 2020, at 8:01 AM, brookxu <brookxu.cn@xxxxxxxxx> wrote: > > We will make these judgments in ext4_mb_good_group(), maybe there > is no need to repeat judgments here. This patch looks like it _may_ have some performance impact on a large filesystem that has some of the block groups that are mostly full, or the group is already marked corrupt. These checks are trying to avoid initializing the group's block allocation bitmap(s) from disk and initializing the buddy bitmap(s) if that is not actually needed. See comment in ext4_mb_regular_allocator(): /* This now checks without needing the buddy page */ ret = ext4_mb_good_group_nolock(ac, group, cr); err = ext4_mb_load_buddy(sb, group, &e4b); if (err) goto out; ext4_lock_group(sb, group); /* * We need to check again after locking the * block group */ ret = ext4_mb_good_group(ac, group, cr); If those checks are not done in ext4_mb_good_group_nolock() to return 0 to the caller, then the call to ext4_mb_init_group() and ext4_mb_load_buddy() will be done, only to find in ext4_mb_good_group() that the group is no good. There were also some performance-related patches in the past that show the calls to ext4_mb_load_buddy() is expensive with real filesystems: commit 1c8457cadc9cefe7ec920a2f3537ff1fe20f4061 Author: Aditya Kali <adityakali@xxxxxxxxxx> AuthorDate: Sat Jun 30 19:10:57 2012 -0400 ext4: avoid uneeded calls to ext4_mb_load_buddy() while reading mb_groups Currently ext4_mb_load_buddy is called for every group, irrespective of whether the group info is already in memory, while reading /proc/fs/ext4/<partition>/mb_groups proc file. For the purpose of mb_groups proc file, it is unnecessary to load the file group info from disk if it was loaded in past. These calls to ext4_mb_load_buddy make reading the mb_groups proc file expensive. commit 78944086663e6c1b03f3d60bf7610128149be5fc Author: Lukas Czerner <lczerner@xxxxxxxxxx> AuthorDate: Tue May 24 18:16:27 2011 -0400 ext4: only load buddy bitmap in ext4_trim_fs() when it is needed Currently we are loading buddy ext4_mb_load_buddy() for every block group we are going through in ext4_trim_fs() in many cases just to find out that there is not enough space to be bothered with. As Amir Goldstein suggested we can use bb_free information directly from ext4_group_info. This commit removes ext4_mb_load_buddy() from ext4_trim_fs() and rather get the ext4_group_info via ext4_get_group_info() and use the bb_free information directly from that. This avoids unnecessary call to load buddy in the case the group does not have enough free space to trim. I think some options still exist for this patch: - make a helper routine like "ext4_mb_group_unsuitable(ac, grp, cr) that does these common checks and use it in both ext4_mb_good_group() and ext4_mb_good_group_nolock() to reduce code duplication. However, checks in ext4_mb_good_group() have also been changed/removed in another of your patches, so it may not make sense anymore? - add an optimization in ext4_mb_good_group_nolock() to do the unlock/lock only in the unlikely case that GRP_NEED_INIT is true: if (should_lock) ext4_lock_group(sb, group); free = grp->bb_free; if (free == 0) goto out; if (cr <= 2 && free < ac->ac_g_ex.fe_len) goto out; if (unlikely(EXT4_MB_GRP_BBITMAP_CORRUPT(grp))) goto out; - if (should_lock) - ext4_unlock_group(sb, group); /* We only do this if the grp has never been initialized */ if (unlikely(EXT4_MB_GRP_NEED_INIT(grp))) { + if (should_lock) + ext4_unlock_group(sb, group); ret = ext4_mb_init_group(ac->ac_sb, group, GFP_NOFS); if (ret) return ret; + if (should_lock) + ext4_lock_group(sb, group); } - if (should_lock) - ext4_lock_group(sb, group); ret = ext4_mb_good_group(ac, group, cr); out: if (should_lock) ext4_unlock_group(sb, group); return ret; } That will avoid some lock contention on the group lock in the common case, which will probably avoid more overhead than . Cheers, Andreas > Signed-off-by: Chunguang Xu <brookxu@xxxxxxxxxxx> > --- > fs/ext4/mballoc.c | 16 ++-------------- > 1 file changed, 2 insertions(+), 14 deletions(-) > > diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c > index 4304113..84871f7 100644 > --- a/fs/ext4/mballoc.c > +++ b/fs/ext4/mballoc.c > @@ -2178,21 +2178,8 @@ static int ext4_mb_good_group_nolock(struct ext4_allocation_context *ac, > struct ext4_group_info *grp = ext4_get_group_info(ac->ac_sb, group); > struct super_block *sb = ac->ac_sb; > bool should_lock = ac->ac_flags & EXT4_MB_STRICT_CHECK; > - ext4_grpblk_t free; > int ret = 0; > > - if (should_lock) > - ext4_lock_group(sb, group); > - free = grp->bb_free; > - if (free == 0) > - goto out; > - if (cr <= 2 && free < ac->ac_g_ex.fe_len) > - goto out; > - if (unlikely(EXT4_MB_GRP_BBITMAP_CORRUPT(grp))) > - goto out; > - if (should_lock) > - ext4_unlock_group(sb, group); > - > /* We only do this if the grp has never been initialized */ > if (unlikely(EXT4_MB_GRP_NEED_INIT(grp))) { > ret = ext4_mb_init_group(ac->ac_sb, group, GFP_NOFS); > @@ -2202,8 +2189,9 @@ static int ext4_mb_good_group_nolock(struct ext4_allocation_context *ac, > > if (should_lock) > ext4_lock_group(sb, group); > + > ret = ext4_mb_good_group(ac, group, cr); > -out: > + > if (should_lock) > ext4_unlock_group(sb, group); > return ret; > -- > 1.8.3.1 > > Cheers, Andreas
Attachment:
signature.asc
Description: Message signed with OpenPGP