From: Lokesh Nagappa Jaliminche <lokesh.jaliminche@xxxxxxxxx> Subject: [PATCH] ext4: optimizing group serch for inode allocation Added a check at the start of group search loop to avoid looping unecessarily in case of empty group. This also allow group search to jump directly to "found_flex_bg" with "stats" and "group" already set, so there is no need to go through the extra steps of setting "best_desc" , "best_group" and then break out of the loop just to set "stats" and "group" again. Signed-off-by: Lokesh Nagappa Jaliminche <lokesh.jaliminche@xxxxxxxxx> --- fs/ext4/ialloc.c | 10 ++++++++++ 1 files changed, 10 insertions(+), 0 deletions(-) diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c index 1b8024d..3463d30 100644 --- a/fs/ext4/ialloc.c +++ b/fs/ext4/ialloc.c @@ -477,6 +477,12 @@ static int find_group_orlov(struct super_block *sb, struct inode *parent, (ext4_test_inode_flag(parent, EXT4_INODE_TOPDIR)))) { int best_ndir = inodes_per_group; int ret = -1; + /* Maximum usable blocks per group as some blocks are used + * for inode tables and allocation bitmaps */ + unsigned long long max_blocks_per_flex = + ((sbi->s_blocks_per_group - + (sbi->s_itb_per_group + 2)) * + flex_size) >> sbi->s_cluster_bits; if (qstr) { hinfo.hash_version = DX_HASH_HALF_MD4; @@ -489,6 +495,10 @@ static int find_group_orlov(struct super_block *sb, struct inode *parent, for (i = 0; i < ngroups; i++) { g = (parent_group + i) % ngroups; get_orlov_stats(sb, g, flex_size, &stats); + /* can't get better group than empty group */ + if (max_blocks_per_flex == stats.free_clusters && + (inodes_per_group * flex_size) == stats.free_inodes) + goto found_flex_bg; if (!stats.free_inodes) continue; if (stats.used_dirs >= best_ndir) -- 1.7.1 On Mon, Jan 18, 2016 at 02:50:48PM -0700, Andreas Dilger wrote: > On Jan 16, 2016, at 4:02 PM, Lokesh Jaliminche <lokesh.jaliminche@xxxxxxxxx> wrote: > > > > From: Lokesh Nagappa Jaliminche <lokesh.jaliminche@xxxxxxxxx> > > Date: Thu, 7 Jan 2016 05:12:20 +0530 > > Subject: [PATCH v5] ext4: optimizing group serch for inode allocation > > > > Added a check at the start of group search loop to > > avoid looping unecessarily in case of empty group. > > This also allow group search to jump directly to > > "found_flex_bg" with "stats" and "group" already set, > > so there is no need to go through the extra steps of > > setting "best_desc" , "best_group" and then break > > out of the loop just to set "stats" and "group" again. > > This looks very good, just one minor potential issue below. > > > Signed-off-by: Lokesh Nagappa Jaliminche <lokesh.jaliminche@xxxxxxxxx> > > --- > > fs/ext4/ialloc.c | 10 ++++++++++ > > 1 files changed, 10 insertions(+), 0 deletions(-) > > > > diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c > > index 1b8024d..d2835cc 100644 > > --- a/fs/ext4/ialloc.c > > +++ b/fs/ext4/ialloc.c > > @@ -477,6 +477,12 @@ static int find_group_orlov(struct super_block *sb, struct inode *parent, > > (ext4_test_inode_flag(parent, EXT4_INODE_TOPDIR)))) { > > int best_ndir = inodes_per_group; > > int ret = -1; > > + /* Maximum usable blocks per group as some blocks are used > > + * for inode tables and allocation bitmaps */ > > + unsigned int max_blocks_per_flex = > > This should be an unsigned long long or __u64, since the max_blocks_per_flex > calculation may overflow a 32-bit value on filesystems with larger block_size > and a large value for flex_size (larger than 2^13 for a 64KB blocksize). > > The "stats.free_clusters" is already a __u64 for this reason. > > > + ((sbi->s_blocks_per_group - > > + (sbi->s_itb_per_group + 2)) * > > + flex_size) >> sbi->s_cluster_bits; > > > > if (qstr) { > > hinfo.hash_version = DX_HASH_HALF_MD4; > > @@ -489,6 +495,10 @@ static int find_group_orlov(struct super_block *sb, struct inode *parent, > > for (i = 0; i < ngroups; i++) { > > g = (parent_group + i) % ngroups; > > get_orlov_stats(sb, g, flex_size, &stats); > > + /* can't get better group than empty group */ > > + if (max_blocks_per_flex == stats.free_clusters && > > + (inodes_per_group * flex_size) == stats.free_inodes) > > + goto found_flex_bg; > > if (!stats.free_inodes) > > continue; > > if (stats.used_dirs >= best_ndir) > > -- > > 1.7.1 > > > > > > > > On Mon, Jan 11, 2016 at 02:13:46PM -0700, Andreas Dilger wrote: > >> On Jan 6, 2016, at 5:46 PM, Lokesh Jaliminche <lokesh.jaliminche@xxxxxxxxx> wrote: > >>> > >>> you are right flex groups cannot be completely empty as some > >>> blocks are used for inode table and allocation bitmaps. I > >>> missed that point. I have corrected that. Also I have tested > >>> and validated this patch by putting some traces it works! > >> > >> Good news. Thanks for updating the patch and testing it. > >> More comments below. > >> > >>> Here are corresponding logs: > >>> commands: > >>> ========= > >>> mkfs.ext4 -g 10000 -G 2 /dev/sdc > >>> mount -t ext4 /dev/sdc /mnt/test_ext4/ > >>> cd /mnt/test_ext4 > >>> mkdir 1 > >>> cd 1 > >>> > >>> logs: > >>> ===== > >>> Jan 7 03:36:45 root kernel: [ 64.792939] max_usable_blocks_per_flex_group = [19692] > >>> Jan 7 03:36:45 root kernel: [ 64.792939] stats.free_clusters = [19264] > >>> Jan 7 03:36:45 root kernel: [ 64.792939] Inodes_per_flex_group = [4864] > >>> Jan 7 03:36:45 root kernel: [ 64.792939] stats.free_inodes = [4853] > >>> Jan 7 03:36:45 root kernel: [ 64.792950] max_usable_blocks_per_flex_group = [19692] > >>> Jan 7 03:36:45 root kernel: [ 64.792950] stats.free_clusters = [19481] > >>> Jan 7 03:36:45 root kernel: [ 64.792950] Inodes_per_flex_group = [4864] > >>> Jan 7 03:36:45 root kernel: [ 64.792950] stats.free_inodes = [4864] > >>> Jan 7 03:36:45 root kernel: [ 64.792958] max_usable_blocks_per_flex_group = [19692] > >>> Jan 7 03:36:45 root kernel: [ 64.792958] stats.free_clusters = [19481] > >>> Jan 7 03:36:45 root kernel: [ 64.792958] Inodes_per_flex_group = [4864] > >>> Jan 7 03:36:45 root kernel: [ 64.792958] stats.free_inodes = [4864] > >>> Jan 7 03:36:45 root kernel: [ 64.792965] max_usable_blocks_per_flex_group = [19692] > >>> Jan 7 03:36:45 root kernel: [ 64.792965] stats.free_clusters = [19481] > >>> Jan 7 03:36:45 root kernel: [ 64.792965] Inodes_per_flex_group = [4864] > >>> Jan 7 03:36:45 root kernel: [ 64.792965] stats.free_inodes = [4864] > >>> Jan 7 03:36:45 root kernel: [ 64.792972] max_usable_blocks_per_flex_group = [19692] > >>> Jan 7 03:36:45 root kernel: [ 64.792972] stats.free_clusters = [19481] > >>> Jan 7 03:36:45 root kernel: [ 64.792972] Inodes_per_flex_group = [4864] > >>> Jan 7 03:36:45 root kernel: [ 64.792972] stats.free_inodes = [4864] > >>> Jan 7 03:36:45 root kernel: [ 64.792979] max_usable_blocks_per_flex_group = [19692]<<<<<<<<<<< > >>> Jan 7 03:36:45 root kernel: [ 64.792979] stats.free_clusters = [19692]<<<<<<<<<<< > >>> Jan 7 03:36:45 root kernel: [ 64.792979] Inodes_per_flex_group = [4864]<<<<<<<<<<<< > >>> Jan 7 03:36:45 root kernel: [ 64.792979] stats.free_inodes = [4864]<<<<<<<<<<<< > >>> Jan 7 03:36:45 root kernel: [ 64.792985] found_flex_bg >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>condition satisfied > >>> > >>> > >>> [Patch v4]: > >>> ========== > >>> From: Lokesh Nagappa Jaliminche <lokesh.jaliminche@xxxxxxxxx> > >>> Date: Thu, 7 Jan 2016 05:12:20 +0530 > >>> Subject: [PATCH] ext4: optimizing group serch for inode allocation > >>> > >>> Added a check at the start of group search loop to > >>> avoid looping unecessarily in case of empty group. > >>> This also allow group search to jump directly to > >>> "found_flex_bg" with "stats" and "group" already set, > >>> so there is no need to go through the extra steps of > >>> setting "best_desc" , "best_group" and then break > >>> out of the loop just to set "stats" and "group" again. > >>> > >>> Signed-off-by: Lokesh Nagappa Jaliminche <lokesh.jaliminche@xxxxxxxxx> > >>> --- > >>> fs/ext4/ext4.h | 2 ++ > >>> fs/ext4/ialloc.c | 21 +++++++++++++++++++-- > >>> 2 files changed, 21 insertions(+), 2 deletions(-) > >>> > >>> diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h > >>> index cc7ca4e..fc48f9a 100644 > >>> --- a/fs/ext4/ext4.h > >>> +++ b/fs/ext4/ext4.h > >>> @@ -326,6 +326,8 @@ struct flex_groups { > >>> #define EXT4_MAX_DESC_SIZE EXT4_MIN_BLOCK_SIZE > >>> #define EXT4_DESC_SIZE(s) (EXT4_SB(s)->s_desc_size) > >>> #ifdef __KERNEL__ > >>> +# define EXT4_INODE_TABLE_BLOCKS_PER_GROUP(s) (EXT4_SB(s)->s_itb_per_group) > >>> +# define EXT4_BLOCKS_PER_CLUSTER(s) (EXT4_SB(s)->s_cluster_ratio) > >> > >> I don't think there is much value to these macros. They aren't much > >> shorter than the actual variable access, and they don't provide any > >> improved clarity for the reader. > >> > >> I think they were introduced many years ago so that this header could > >> be shared between the kernel and e2fsprogs, but that is no longer true > >> so I don't think there is a need to add new ones. > >> > >> Ted, any opinion on this? > >> > >>> # define EXT4_BLOCKS_PER_GROUP(s) (EXT4_SB(s)->s_blocks_per_group) > >>> # define EXT4_CLUSTERS_PER_GROUP(s) (EXT4_SB(s)->s_clusters_per_group) > >>> # define EXT4_DESC_PER_BLOCK(s) (EXT4_SB(s)->s_desc_per_block) > >>> diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c > >>> index 1b8024d..dc66ad8 100644 > >>> --- a/fs/ext4/ialloc.c > >>> +++ b/fs/ext4/ialloc.c > >>> @@ -463,7 +463,6 @@ static int find_group_orlov(struct super_block *sb, struct inode *parent, > >>> sbi->s_log_groups_per_flex; > >>> parent_group >>= sbi->s_log_groups_per_flex; > >>> } > >>> - > >>> freei = percpu_counter_read_positive(&sbi->s_freeinodes_counter); > >>> avefreei = freei / ngroups; > >> > >> Not sure why this line is being deleted? > >> > >>> @@ -477,7 +476,20 @@ static int find_group_orlov(struct super_block *sb, struct inode *parent, > >>> (ext4_test_inode_flag(parent, EXT4_INODE_TOPDIR)))) { > >>> int best_ndir = inodes_per_group; > >>> int ret = -1; > >>> - > >>> + /* blocks used for inode table and allocation bitmaps */ > >>> + unsigned int max_usable_blocks_per_flex_group; > >> > >> This could be shorter and still clear, like "max_blocks_per_flex". > >> > >>> + unsigned int blocks_per_group = EXT4_BLOCKS_PER_GROUP(sb); > >>> + unsigned int inode_table_blocks_per_group = \ > >>> + EXT4_INODE_TABLE_BLOCKS_PER_GROUP(sb); > >>> + /* Number of blocks per cluster */ > >>> + unsigned int cluster_ratio = EXT4_BLOCKS_PER_CLUSTER(sb); > >> > >> There also isn't much value to making local variables that hold these > >> same values again. It increases stack space for little benefit. They > >> are only accessed once anyway. > >> > >>> + unsigned int inodes_per_flex_group = inodes_per_group * \ > >>> + flex_size; > >>> + max_usable_blocks_per_flex_group = \ > >>> + ((blocks_per_group*flex_size) - \ > >> > >> Spaces around that '*'. > >> > >>> + (inode_table_blocks_per_group * \ > >>> + flex_size + 2 * flex_size)) / \ > >> > >> This is C and not a CPP macro, so no need for linefeed escapes. > >> > >>> + cluster_ratio; > >> > >> This should be ">> EXT4_SB(sb)->s_cluster_bits" to avoid the divide. > >> > >>> if (qstr) { > >>> hinfo.hash_version = DX_HASH_HALF_MD4; > >>> hinfo.seed = sbi->s_hash_seed; > >>> @@ -489,6 +501,11 @@ static int find_group_orlov(struct super_block *sb, struct inode *parent, > >>> for (i = 0; i < ngroups; i++) { > >>> g = (parent_group + i) % ngroups; > >>> get_orlov_stats(sb, g, flex_size, &stats); > >>> + /* can't get better group than empty group */ > >>> + if (max_usable_blocks_per_flex_group == \ > >>> + stats.free_clusters && \ > >> > >> No need for linefeed escapes. > >> > >>> + inodes_per_flex_group == stats.free_inodes) > >> > >> Align continued line after '(' from the "if (" line. > >> > >>> + goto found_flex_bg; > >> > >> This is indented too much. It will work properly when you fix the > >> previous line. > >> > >>> if (!stats.free_inodes) > >>> continue; > >>> if (stats.used_dirs >= best_ndir) > >>> -- > >>> 1.7.1 > >>> > >>> Regards, > >>> Lokesh > >>> > >>> On Sun, Jan 03, 2016 at 11:34:51AM -0700, Andreas Dilger wrote: > >>>> On Dec 26, 2015, at 01:41, Lokesh Jaliminche <lokesh.jaliminche@xxxxxxxxx> wrote: > >>>>> > >>>>> > >>>>> From 5199982d29ff291b181c25af63a22d6f2d6d3f7b Mon Sep 17 00:00:00 2001 > >>>>> From: Lokesh Nagappa Jaliminche <lokesh.jaliminche@xxxxxxxxx> > >>>>> Date: Sat, 26 Dec 2015 13:19:36 +0530 > >>>>> Subject: [PATCH v3] ext4: optimizing group search for inode allocation > >>>>> > >>>>> Added a check at the start of group search loop to > >>>>> avoid looping unecessarily in case of empty group. > >>>>> This also allow group search to jump directly to > >>>>> "found_flex_bg" with "stats" and "group" already set, > >>>>> so there is no need to go through the extra steps of > >>>>> setting "best_desc" and "best_group" and then break > >>>>> out of the loop just to set "stats" and "group" again. > >>>>> > >>>>> Signed-off-by: Lokesh Nagappa Jaliminche <lokesh.jaliminche@xxxxxxxxx> > >>>>> --- > >>>>> fs/ext4/ialloc.c | 12 ++++++++++-- > >>>>> 1 files changed, 10 insertions(+), 2 deletions(-) > >>>>> > >>>>> diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c > >>>>> index 1b8024d..16f94db 100644 > >>>>> --- a/fs/ext4/ialloc.c > >>>>> +++ b/fs/ext4/ialloc.c > >>>>> @@ -473,10 +473,14 @@ static int find_group_orlov(struct super_block *sb, struct inode *parent, > >>>>> ndirs = percpu_counter_read_positive(&sbi->s_dirs_counter); > >>>>> > >>>>> if (S_ISDIR(mode) && > >>>>> - ((parent == d_inode(sb->s_root)) || > >>>>> - (ext4_test_inode_flag(parent, EXT4_INODE_TOPDIR)))) { > >>>>> + ((parent == d_inode(sb->s_root)) || > >>>>> + (ext4_test_inode_flag(parent, EXT4_INODE_TOPDIR)))) { > >>>> > >>>> I don't understand why you are changing the indentation here? > >>>> > >>>>> + unsigned int inodes_per_flex_group; > >>>>> + unsigned long int clusters_per_flex_group; > >>>>> int best_ndir = inodes_per_group; > >>>>> int ret = -1; > >>>>> + inodes_per_flex_group = inodes_per_group * flex_size; > >>>>> + clusters_per_flex_group = sbi->s_clusters_per_group * flex_size; > >>>> > >>>> Have you actually tested if this code works? When I was looking at this > >>>> code I was accidentally looking at the ext2 version, which only deals with > >>>> individual groups, and not flex groups. I don't think that a flex group > >>>> can actually be completely empty, since it will always contain at least > >>>> an inode table and some bitmaps. > >>>> > >>>> This calculation needs to take this into account or it will just be > >>>> overhead and not an optimization. Something like: > >>>> > >>>> /* account for inode table and allocation bitmaps */ > >>>> clusters_per_flex_group = sbi->s_clusters_per_group * flex_size - > >>>> ((inodes_per_flex_group * EXT4_INODE_SIZE(sb) + > >>>> (flex_size << sb->s_blocksize_bits) * 2 ) >> > >>>> EXT4_SB(sb)->s_cluster_bits; > >>>> > >>>> It would be worthwhile to print this out and verify that there are actually > >>>> groups with this many free blocks in a newly formatted filesystem. > >>>> > >>>> Cheers, Andreas > >>>> > >>>>> if (qstr) { > >>>>> hinfo.hash_version = DX_HASH_HALF_MD4; > >>>>> @@ -489,6 +493,10 @@ static int find_group_orlov(struct super_block *sb, struct inode *parent, > >>>>> for (i = 0; i < ngroups; i++) { > >>>>> g = (parent_group + i) % ngroups; > >>>>> get_orlov_stats(sb, g, flex_size, &stats); > >>>>> + /* the group can't get any better than empty */ > >>>>> + if (inodes_per_flex_group == stats.free_inodes && > >>>>> + clusters_per_flex_group == stats.free_clusters) > >>>>> + goto found_flex_bg; > >>>>> if (!stats.free_inodes) > >>>>> continue; > >>>>> if (stats.used_dirs >= best_ndir) > >>>>> -- > >>>>> 1.7.1 > >>>>> > >>>>> > >>>>>> On Mon, Dec 21, 2015 at 10:27:37AM -0700, Andreas Dilger wrote: > >>>>>>> On Dec 18, 2015, at 4:32 PM, lokesh jaliminche <lokesh.jaliminche@xxxxxxxxx> wrote: > >>>>>>> > >>>>>>> From 9e09fef78b2fa552c883bf8124af873abfde0805 Mon Sep 17 00:00:00 2001 > >>>>>>> From: Lokesh Nagappa Jaliminche <lokesh.jaliminche@xxxxxxxxxxx> > >>>>>> > >>>>>> In the patch summary there is a typo - s/serch/search/ > >>>>>> > >>>>>> Also, it appears your email client has mangled the whitespace in the > >>>>>> patch. Please use "git send-email" to send your patch to the list: > >>>>>> > >>>>>> git send-email --to tytso@xxxxxxx --cc linux-ext4@xxxxxxxxxxxxxxx HEAD~1 > >>>>>> > >>>>>> so that it arrives intact. You probably need to set it up in ~/.gitconfig: > >>>>>> > >>>>>> [sendemail] > >>>>>> confirm = compose > >>>>>> smtpdomain = {your client hostname} > >>>>>> smtpserver = {your SMTP server hostname} > >>>>>> > >>>>>> Cheers, Andreas > >>>>>> > >>>>>>> Added a check at the start of group search loop to > >>>>>>> avoid looping unecessarily in case of empty group. > >>>>>>> This also allow group search to jump directly to > >>>>>>> "found_flex_bg" with "stats" and "group" already set, > >>>>>>> so there is no need to go through the extra steps of > >>>>>>> setting "best_desc" and "best_group" and then break > >>>>>>> out of the loop just to set "stats" and "group" again. > >>>>>>> > >>>>>>> Signed-off-by: Lokesh N Jaliminche <lokesh.jaliminche@xxxxxxxxx> > >>>>>>> --- > >>>>>>> fs/ext4/ialloc.c | 8 ++++++++ > >>>>>>> 1 files changed, 8 insertions(+), 0 deletions(-) > >>>>>>> > >>>>>>> diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c > >>>>>>> index 1b8024d..588bf8e 100644 > >>>>>>> --- a/fs/ext4/ialloc.c > >>>>>>> +++ b/fs/ext4/ialloc.c > >>>>>>> @@ -446,6 +446,8 @@ static int find_group_orlov(struct super_block > >>>>>>> *sb, struct inode *parent, > >>>>>>> struct ext4_sb_info *sbi = EXT4_SB(sb); > >>>>>>> ext4_group_t real_ngroups = ext4_get_groups_count(sb); > >>>>>>> int inodes_per_group = EXT4_INODES_PER_GROUP(sb); > >>>>>>> + unsigned int inodes_per_flex_group; > >>>>>>> + long unsigned int blocks_per_clustre; > >>>>>>> unsigned int freei, avefreei, grp_free; > >>>>>>> ext4_fsblk_t freeb, avefreec; > >>>>>>> unsigned int ndirs; > >>>>>>> @@ -470,6 +472,8 @@ static int find_group_orlov(struct super_block > >>>>>>> *sb, struct inode *parent, > >>>>>>> percpu_counter_read_positive(&sbi->s_freeclusters_counter)); > >>>>>>> avefreec = freeb; > >>>>>>> do_div(avefreec, ngroups); > >>>>>>> + inodes_per_flex_group = inodes_per_group * flex_size; > >>>>>>> + blocks_per_clustre = sbi->s_blocks_per_group * flex_size; > >>>>>>> ndirs = percpu_counter_read_positive(&sbi->s_dirs_counter); > >>>>>>> > >>>>>>> if (S_ISDIR(mode) && > >>>>>>> @@ -489,6 +493,10 @@ static int find_group_orlov(struct super_block > >>>>>>> *sb, struct inode *parent, > >>>>>>> for (i = 0; i < ngroups; i++) { > >>>>>>> g = (parent_group + i) % ngroups; > >>>>>>> get_orlov_stats(sb, g, flex_size, &stats); > >>>>>>> + /* the group can't get any better than empty */ > >>>>>>> + if (inodes_per_flex_group == stats.free_inodes && > >>>>>>> + blocks_per_clustre == stats.free_clusters) > >>>>>>> + goto found_flex_bg; > >>>>>>> if (!stats.free_inodes) > >>>>>>> continue; > >>>>>>> if (stats.used_dirs >= best_ndir) > >>>>>>> -- > >>>>>>> 1.7.1 > >>>>>>> <0001-EXT4-optimizing-group-serch-for-inode-allocation.patch> > >>>>>> > >>>>>> > >>>>>> Cheers, Andreas > >>>>> > >>>>> > >>>>> <0001-ext4-optimizing-group-serch-for-inode-allocation.patch> > >>> <0001-ext4-optimizing-group-serch-for-inode-allocation.patch> > >> > >> > >> Cheers, Andreas > >> > >> > >> > >> > >> > > > > > > <0001-ext4-optimizing-group-serch-for-inode-allocation.patch> > > > Cheers, Andreas > > > > >
>From 13530650a7dd1a0937c72e6a6acbb1bab59c3bbd Mon Sep 17 00:00:00 2001 From: Lokesh Nagappa Jaliminche <lokesh.jaliminche@xxxxxxxxx> Date: Thu, 7 Jan 2016 05:12:20 +0530 Subject: [PATCH] ext4: optimizing group serch for inode allocation Added a check at the start of group search loop to avoid looping unecessarily in case of empty group. This also allow group search to jump directly to "found_flex_bg" with "stats" and "group" already set, so there is no need to go through the extra steps of setting "best_desc" , "best_group" and then break out of the loop just to set "stats" and "group" again. Signed-off-by: Lokesh Nagappa Jaliminche <lokesh.jaliminche@xxxxxxxxx> --- fs/ext4/ialloc.c | 10 ++++++++++ 1 files changed, 10 insertions(+), 0 deletions(-) diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c index 1b8024d..3463d30 100644 --- a/fs/ext4/ialloc.c +++ b/fs/ext4/ialloc.c @@ -477,6 +477,12 @@ static int find_group_orlov(struct super_block *sb, struct inode *parent, (ext4_test_inode_flag(parent, EXT4_INODE_TOPDIR)))) { int best_ndir = inodes_per_group; int ret = -1; + /* Maximum usable blocks per group as some blocks are used + * for inode tables and allocation bitmaps */ + unsigned long long max_blocks_per_flex = + ((sbi->s_blocks_per_group - + (sbi->s_itb_per_group + 2)) * + flex_size) >> sbi->s_cluster_bits; if (qstr) { hinfo.hash_version = DX_HASH_HALF_MD4; @@ -489,6 +495,10 @@ static int find_group_orlov(struct super_block *sb, struct inode *parent, for (i = 0; i < ngroups; i++) { g = (parent_group + i) % ngroups; get_orlov_stats(sb, g, flex_size, &stats); + /* can't get better group than empty group */ + if (max_blocks_per_flex == stats.free_clusters && + (inodes_per_group * flex_size) == stats.free_inodes) + goto found_flex_bg; if (!stats.free_inodes) continue; if (stats.used_dirs >= best_ndir) -- 1.7.1