On Thu, Jun 08, 2023 at 10:45:05AM -0400, Theodore Ts'o wrote: > Jan, thanks for the comments to Ojaswin's patch series. Since I had > already landed his patch series in my tree and have been testing it, > I've fixed the obvious issues you've raised in a fixup patch > (attached). > > There is one issue which I have not fixed: > > On Wed, Jun 07, 2023 at 12:21:03PM +0200, Jan Kara wrote: > > > + for (i = order; i >= min_order; i--) { > > > + int frag_order; > > > + /* > > > + * Scale down goal len to make sure we find something > > > + * in the free fragments list. Basically, reduce > > > + * preallocations. > > > + */ > > > + ac->ac_g_ex.fe_len = 1 << i; > > > > I smell some off-by-one issues here. Look fls(1) == 1 so (1 << fls(n)) > n. > > Hence this loop will actually *grow* the goal allocation length. Also I'm > > not sure why you have +1 in min_order = fls(ac->ac_o_ex.fe_len) + 1. > > Ojaswin, could you take a look this? Thanks!! > > - Ted > > commit 182d2d90a180838789ed5a19e08c333043d1617a > Author: Theodore Ts'o <tytso@xxxxxxx> > Date: Thu Jun 8 10:39:35 2023 -0400 > > ext4: clean up mballoc criteria comments > > Line wrap and slightly clarify the comments describing mballoc's > cirtiera. > > Define EXT4_MB_NUM_CRS as part of the enum, so that it will > automatically get updated when criteria is added or removed. > > Also fix a potential unitialized use of 'cr' variable if > CONFIG_EXT4_DEBUG is enabled. > > Signed-off-by: Theodore Ts'o <tytso@xxxxxxx> Hi Ted, Patch looks good, thanks for doing this. I've sent the fix for the off by one issue here: https://lore.kernel.org/linux-ext4/20230609103403.112807-1-ojaswin@xxxxxxxxxxxxx/T/#u Jan, thanks for the review. I've addressed the bug for now. Since I'm on vacation for the next one and a half week I might not be able to address the other cleanups. I'll get them done once I'm back. Regards, ojaswin > > diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h > index 6a1f013d23f7..45a531446ea2 100644 > --- a/fs/ext4/ext4.h > +++ b/fs/ext4/ext4.h > @@ -128,47 +128,52 @@ enum SHIFT_DIRECTION { > }; > > /* > - * Number of criterias defined. For each criteria, mballoc has slightly > - * different way of finding the required blocks nad usually, higher the > - * criteria the slower the allocation. We start at lower criterias and keep > - * falling back to higher ones if we are not able to find any blocks. > - */ > -#define EXT4_MB_NUM_CRS 5 > -/* > - * All possible allocation criterias for mballoc. Lower are faster. > + * For each criteria, mballoc has slightly different way of finding > + * the required blocks nad usually, higher the criteria the slower the > + * allocation. We start at lower criterias and keep falling back to > + * higher ones if we are not able to find any blocks. Lower (earlier) > + * criteria are faster. > */ > enum criteria { > /* > - * Used when number of blocks needed is a power of 2. This doesn't > - * trigger any disk IO except prefetch and is the fastest criteria. > + * Used when number of blocks needed is a power of 2. This > + * doesn't trigger any disk IO except prefetch and is the > + * fastest criteria. > */ > CR_POWER2_ALIGNED, > > /* > - * Tries to lookup in-memory data structures to find the most suitable > - * group that satisfies goal request. No disk IO except block prefetch. > + * Tries to lookup in-memory data structures to find the most > + * suitable group that satisfies goal request. No disk IO > + * except block prefetch. > */ > CR_GOAL_LEN_FAST, > > /* > - * Same as CR_GOAL_LEN_FAST but is allowed to reduce the goal length to > - * the best available length for faster allocation. > + * Same as CR_GOAL_LEN_FAST but is allowed to reduce the goal > + * length to the best available length for faster allocation. > */ > CR_BEST_AVAIL_LEN, > > /* > - * Reads each block group sequentially, performing disk IO if necessary, to > - * find find_suitable block group. Tries to allocate goal length but might trim > - * the request if nothing is found after enough tries. > + * Reads each block group sequentially, performing disk IO if > + * necessary, to find find_suitable block group. Tries to > + * allocate goal length but might trim the request if nothing > + * is found after enough tries. > */ > CR_GOAL_LEN_SLOW, > > /* > - * Finds the first free set of blocks and allocates those. This is only > - * used in rare cases when CR_GOAL_LEN_SLOW also fails to allocate > - * anything. > + * Finds the first free set of blocks and allocates > + * those. This is only used in rare cases when > + * CR_GOAL_LEN_SLOW also fails to allocate anything. > */ > CR_ANY_FREE, > + > + /* > + * Number of criterias defined. > + */ > + EXT4_MB_NUM_CRS > }; > > /* criteria below which we use fast block scanning and avoid unnecessary IO */ > diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c > index 8a6896d4e9b0..2f9f5dc720cc 100644 > --- a/fs/ext4/mballoc.c > +++ b/fs/ext4/mballoc.c > @@ -2759,7 +2759,7 @@ static noinline_for_stack int > ext4_mb_regular_allocator(struct ext4_allocation_context *ac) > { > ext4_group_t prefetch_grp = 0, ngroups, group, i; > - enum criteria cr, new_cr; > + enum criteria new_cr, cr = CR_GOAL_LEN_FAST; > int err = 0, first_err = 0; > unsigned int nr = 0, prefetch_ios = 0; > struct ext4_sb_info *sbi; > @@ -2816,12 +2816,13 @@ ext4_mb_regular_allocator(struct ext4_allocation_context *ac) > spin_unlock(&sbi->s_md_lock); > } > > - /* Let's just scan groups to find more-less suitable blocks */ > - cr = ac->ac_2order ? CR_POWER2_ALIGNED : CR_GOAL_LEN_FAST; > /* > - * cr == CR_POWER2_ALIGNED try to get exact allocation, > - * cr == CR_ANY_FREE try to get anything > + * Let's just scan groups to find more-less suitable blocks We > + * start with CR_GOAL_LEN_FAST, unless it is power of 2 > + * aligned, in which case let's do that faster approach first. > */ > + if (ac->ac_2order) > + cr = CR_POWER2_ALIGNED; > repeat: > for (; cr < EXT4_MB_NUM_CRS && ac->ac_status == AC_STATUS_CONTINUE; cr++) { > ac->ac_criteria = cr;