A ~5% performance regression was discovered on the aim9.brk_test.ops_per_sec by the linux kernel test bot [0]. In the past to satisfy brk() performance we duplicated VMA expansion code and special-cased do_brk_flags(). This is however horrid and undoes work to abstract this logic, so in resolving the issue I have endeavoured to avoid this. Investigating further I was able to observe that the use of a vma_iter_next_range() and vma_prev() pair, causing an unnecessary maple tree walk. In addition there is work that we do that is simply unnecessary for brk(). Therefore, add a special VMA merge mode VMG_FLAG_JUST_EXPAND to avoid doing any of this - it assumes the VMA iterator is pointing at the previous VMA and which skips logic that brk() does not require. This mostly eliminates the performance regression reducing it to ~2% which is in the realm of noise. In addition, the will-it-scale test brk2, written to be more representative of real-world brk() usage, shows a modest performance improvement - which gives me confidence that we are not meaningfully regressing real workloads here. This series includes a test asserting that the 'just expand' mode works as expected. With many thanks to Oliver Sang for helping with performance testing of candidate patch sets! [0]:https://lore.kernel.org/linux-mm/202409301043.629bea78-oliver.sang@xxxxxxxxx Lorenzo Stoakes (2): mm/vma: add expand-only VMA merge mode and optimise do_brk_flags() tools: testing: add expand-only mode VMA test mm/mmap.c | 3 ++- mm/vma.c | 23 +++++++++++++++-------- mm/vma.h | 14 ++++++++++++++ tools/testing/vma/vma.c | 40 ++++++++++++++++++++++++++++++++++++++++ 4 files changed, 71 insertions(+), 9 deletions(-) -- 2.46.2