On Tue, Jun 06, 2023 at 10:00:57PM +0800, Guoqing Jiang wrote: > Hello, > > On 5/30/23 20:33, Ojaswin Mujoo wrote: > > Before this patch, the call stack in ext4_run_li_request is as follows: > > > > /* > > * nr = no. of BGs we want to fetch (=s_mb_prefetch) > > * prefetch_ios = no. of BGs not uptodate after > > * ext4_read_block_bitmap_nowait() > > */ > > next_group = ext4_mb_prefetch(sb, group, nr, prefetch_ios); > > ext4_mb_prefetch_fini(sb, next_group prefetch_ios); > > > > ext4_mb_prefetch_fini() will only try to initialize buddies for BGs in > > range [next_group - prefetch_ios, next_group). This is incorrect since > > sometimes (prefetch_ios < nr), which causes ext4_mb_prefetch_fini() to > > incorrectly ignore some of the BGs that might need initialization. This > > issue is more notable now with the previous patch enabling "fetching" of > > BLOCK_UNINIT BGs which are marked buffer_uptodate by default. > > > > Fix this by passing nr to ext4_mb_prefetch_fini() instead of > > prefetch_ios so that it considers the right range of groups. > > Thanks for the series. > > > Similarly, make sure we don't pass nr=0 to ext4_mb_prefetch_fini() in > > ext4_mb_regular_allocator() since we might have prefetched BLOCK_UNINIT > > groups that would need buddy initialization. > > Seems ext4_mb_prefetch_fini can't be called by ext4_mb_regular_allocator > if nr is 0. Hi Guoqing, Sorry I was on vacation so didn't get a chance to reply to this sooner. Let me explain what I meant by that statement in the commit message. So basically, the prefetch_ios output argument is incremented whenever ext4_mb_prefetch() reads a block group with !buffer_uptodate(bh). However, for BLOCK_UNINIT BGs the buffer is marked uptodate after initialization and hence prefetch_ios is not incremented when such BGs are prefetched. This leads to nr becoming 0 due to the following line (removed in this patch): if (prefetch_ios == curr_ios) nr = 0; hence ext4_mb_prefetch_fini() would never pre initialise the corresponding buddy structures. Instead, these structures would then get initialized probably at a later point during the slower allocation criterias. The motivation of making sure the BLOCK_UNINIT BGs' buddies are pre initialized is so the faster allocation criterias can utilize the data to make better decisions. Regards, ojaswin > > https://elixir.bootlin.com/linux/v6.4-rc5/source/fs/ext4/mballoc.c#L2816 > > Am I miss something? > > Thanks, > Guoqing >