Hello all! This patch changes the linear searching in CR_GOAL_LEN_SLOW to use an array of lists based on the order of requested length. The list array itself holds one list for each possible order of free blocks in a block group. While running some performance tests on my setup, I noticed that although the groups we were considering before finding a good group in CR_GOAL_LEN_SLOW did improve however there was not much improvement (or regression) in performance as such, which I believe could be partly because my test disk was not big enough to highlight the issues of linear traversal in CR_GOAL_LEN_SLOW. Unfortunately, I currently don't have a disk that might be big enough for this however I think theoretically we should see improvement when we have a very large disk and most of the BGs are near full. Suggestions on alternate ways to test this or help in getting some performance numbers is welcome. That being said, I do believe that other than performance, having such a design in CR_GOAL_LEN_SLOW is beneficial because: 1. We'll be able to select a BG whose free block order matches our goal length and hence we won't be filling up bigger holes for small requests. This can improve fragmentation in the longer run. 2. We'll have more control in what block groups we want to allocate from, making CR_GOAL_LEN_SLOW more flexible. For example, the ongoing discussions around introducing new types of BGs marked w/ IOPS flag [1] which we only want to use for metadata allocation. With the current design of linear search, we'll have to check for this flag everytime and skip the BG if we have a data allocation. However, with the proposed design, we can easily set up our freelist to ignore IOPS BGs, which is much more inline with how that patch handles other criterias. Now, with this patch, we'll no longer look for BGs linearly and hence this has a chance of increasing the spread of allocation however I think that shouldn't be a problem since we still have the MB_DEFAULT_LINEAR_LIMIT for rotational devices combined with Jan's (merged) patchset that fixed several issues related to allocation spread [2]. This seems to pass quick xfstests and several iterations for the generic/269 test that stresses the mballoc code. However, there's still some todos in the patchset eg more testing, refactoring code, bug hunting etc. I wanted to get the RFC out to check what the community feels about this and for any suggestions and ideas. Thanks! [1] https://lore.kernel.org/linux-ext4/OS3P286MB056789DF4EBAA7363A4346B5AF06A@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/ [2] https://lore.kernel.org/all/20220908091301.147-1-jack@xxxxxxx/ Ojaswin Mujoo (1): ext4: Replace linear search with array of lists in CR_GOAL_LEN_SLOW fs/ext4/ext4.h | 25 +++++++-- fs/ext4/mballoc.c | 134 ++++++++++++++++++++++++++++++++++++++++++++-- fs/ext4/mballoc.h | 5 ++ 3 files changed, 156 insertions(+), 8 deletions(-) -- 2.31.1