* Yin, Fengwei <fengwei.yin@xxxxxxxxx> [230602 04:11]:
> Hi Liam,
>
> On 6/1/2023 10:15 AM, Liam R. Howlett wrote:
> > Initial work on preallocations showed no regression in performance
> > during testing, but recently some users (both on [1] and off [android]
> > list) have reported that preallocating the worst-case number of nodes
> > has caused some slow down.  This patch set addresses the number of
> > allocations in a few ways.
> >
> > During munmap() most munmap() operations will remove a single VMA, so
> > leverage the fact that the maple tree can place a single pointer at
> > range 0 - 0 without allocating.  This is done by changing the index in
> > the 'sidetree'.
> >
> > Re-introduce the entry argument to mas_preallocate() so that a more
> > intelligent guess of the node count can be made.
> >
> > Patches are in the following order:
> > 0001-0002: Testing framework for benchmarking some operations
> > 0003-0004: Reduction of maple node allocation in sidetree
> > 0005:      Small cleanup of do_vmi_align_munmap()
> > 0006-0013: mas_preallocate() calculation change
> > 0014:      Change the vma iterator order
> I did run The AIM:page_test on an IceLake 48C/96T + 192G RAM platform with
> this patchset.
>
> The result has a little bit improvement:
> Base (next-20230602):
>   503880
> Base with this patchset:
>   519501
>
> But they are far from the none-regression result (commit 7be1c1a3c7b1):
>   718080
>
>
> Some other information I collected:
> With Base, the mas_alloc_nodes are always hit with request: 7.
> With this patchset, the request are 1 or 5.
>
> I suppose this is the reason for improvement from 503880 to 519501.
>
> With commit 7be1c1a3c7b1, mas_store_gfp() in do_brk_flags never triggered
> mas_alloc_nodes() call.

Thanks.  Thanks for retesting.  I've not been able to see the regression
myself.  Are you running in a VM of sorts?  Android and some cloud VMs
seem to see this, but I do not in kvm or the server I test on.
I am still looking to reduce/reverse the regression and a reproducer on
my end would help.

>
>
> Regards
> Yin, Fengwei
>
> >
> > [1] https://lore.kernel.org/linux-mm/202305061457.ac15990c-yujie.liu@xxxxxxxxx/
> >
> > Liam R. Howlett (14):
> >   maple_tree: Add benchmarking for mas_for_each
> >   maple_tree: Add benchmarking for mas_prev()
> >   mm: Move unmap_vmas() declaration to internal header
> >   mm: Change do_vmi_align_munmap() side tree index
> >   mm: Remove prev check from do_vmi_align_munmap()
> >   maple_tree: Introduce __mas_set_range()
> >   mm: Remove re-walk from mmap_region()
> >   maple_tree: Re-introduce entry to mas_preallocate() arguments
> >   mm: Use vma_iter_clear_gfp() in nommu
> >   mm: Set up vma iterator for vma_iter_prealloc() calls
> >   maple_tree: Move mas_wr_end_piv() below mas_wr_extend_null()
> >   maple_tree: Update mas_preallocate() testing
> >   maple_tree: Refine mas_preallocate() node calculations
> >   mm/mmap: Change vma iteration order in do_vmi_align_munmap()
> >
> >  fs/exec.c                        |   1 +
> >  include/linux/maple_tree.h       |  23 ++++-
> >  include/linux/mm.h               |   4 -
> >  lib/maple_tree.c                 |  78 ++++++++++----
> >  lib/test_maple_tree.c            |  74 +++++++++++++
> >  mm/internal.h                    |  40 ++++++--
> >  mm/memory.c                      |  16 ++-
> >  mm/mmap.c                        | 171 ++++++++++++++++---------------
> >  mm/nommu.c                       |  45 ++++----
> >  tools/testing/radix-tree/maple.c |  59 ++++++-----
> >  10 files changed, 331 insertions(+), 180 deletions(-)
> >
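
For readers comparing the three AIM page_test figures quoted in this thread, the relative differences can be worked out with a few lines of arithmetic. The sketch below is not part of the patch set or the original messages; the helper name `pct_change` and the dictionary labels are made up for illustration, and only the three throughput numbers come from the report.

```python
# Throughput figures copied from the AIM page_test report in this thread.
# The helper name and the dictionary labels are illustrative assumptions.

def pct_change(reference: float, value: float) -> float:
    """Return the change of `value` relative to `reference`, in percent."""
    return (value - reference) / reference * 100.0


results = {
    "next-20230602": 503880,
    "next-20230602 + patchset": 519501,
    "commit 7be1c1a3c7b1": 718080,
}

base = results["next-20230602"]
patched = results["next-20230602 + patchset"]
no_regression = results["commit 7be1c1a3c7b1"]

if __name__ == "__main__":
    # Gain from the patch set over the unpatched base.
    print(f"patchset vs base:          {pct_change(base, patched):+.2f}%")
    # Remaining gap to the pre-regression commit.
    print(f"patchset vs no-regression: {pct_change(no_regression, patched):+.2f}%")
    print(f"base vs no-regression:     {pct_change(no_regression, base):+.2f}%")
```

Read this way, the patch set recovers roughly 3% over the base, while both the base and the patched kernel remain well over a quarter below the pre-regression result.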