Re: [PATCH 00/14] Reduce preallocations for maple tree

* Yin, Fengwei <fengwei.yin@xxxxxxxxx> [230602 04:11]:
> Hi Liam,
> 
> On 6/1/2023 10:15 AM, Liam R. Howlett wrote:
> > Initial work on preallocations showed no regression in performance
> > during testing, but recently some users (both on-list [1] and off-list,
> > on Android) have reported that preallocating the worst-case number of
> > nodes has caused some slowdown.  This patch set addresses the number of
> > allocations in a few ways.
> > 
> > Most munmap() operations remove a single VMA, so leverage the fact
> > that the maple tree can place a single pointer at range 0 - 0 without
> > allocating.  This is done by changing the index used in the
> > 'sidetree'.
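> > 
> > A rough sketch of the idea (illustrative only; names and labels are
> > approximate):
> > 
> > 	/*
> > 	 * Store the nth detached VMA at index n rather than at its
> > 	 * address range: a single entry at 0 - 0 lives in the root
> > 	 * pointer, so the common one-VMA unmap allocates nothing.
> > 	 */
> > 	mas_set(&mas_detach, count);
> > 	if (mas_store_gfp(&mas_detach, next, GFP_KERNEL))
> > 		goto error;
> > 	count++;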
> > 
> > Re-introduce the entry argument to mas_preallocate() so that a more
> > intelligent guess of the node count can be made.
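> > 
> > A caller then looks roughly like this (sketch; actual call sites
> > vary):
> > 
> > 	mas_set_range(&mas, vma->vm_start, vma->vm_end - 1);
> > 	/*
> > 	 * Passing the new entry lets the preallocation be bounded by
> > 	 * what this store will actually need, instead of assuming the
> > 	 * worst case for every write.
> > 	 */
> > 	if (mas_preallocate(&mas, vma, GFP_KERNEL))
> > 		return -ENOMEM;
> > 	mas_store_prealloc(&mas, vma);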
> > 
> > Patches are in the following order:
> > 0001-0002: Testing framework for benchmarking some operations
> > 0003-0004: Reduction of maple node allocation in sidetree
> > 0005:      Small cleanup of do_vmi_align_munmap()
> > 0006-0013: mas_preallocate() calculation change
> > 0014:      Change the vma iterator order
> I ran the AIM:page_test benchmark on an Ice Lake 48C/96T + 192G RAM
> platform with this patchset.
> 
> The result shows a slight improvement:
> Base (next-20230602):
>   503880
> Base with this patchset:
>   519501
> 
> But they are far from the non-regression result (commit 7be1c1a3c7b1):
>   718080
> 
> 
> Some other information I collected:
> With Base, mas_alloc_nodes() is always hit with a request of 7.
> With this patchset, the requests are 1 or 5.
> 
> I suppose this is the reason for the improvement from 503880 to 519501.
> 
> With commit 7be1c1a3c7b1, mas_store_gfp() in do_brk_flags() never
> triggered a mas_alloc_nodes() call.  Thanks.

Thanks for retesting.  I've not been able to see the regression myself.
Are you running in a VM of some sort?  Android and some cloud VMs seem to
see this, but I do not in KVM or on the server I test on.

I am still looking to reduce or reverse the regression, and a reproducer
on my end would help.

> 
> 
> Regards
> Yin, Fengwei
> 
> > 
> > [1] https://lore.kernel.org/linux-mm/202305061457.ac15990c-yujie.liu@xxxxxxxxx/
> > 
> > Liam R. Howlett (14):
> >   maple_tree: Add benchmarking for mas_for_each
> >   maple_tree: Add benchmarking for mas_prev()
> >   mm: Move unmap_vmas() declaration to internal header
> >   mm: Change do_vmi_align_munmap() side tree index
> >   mm: Remove prev check from do_vmi_align_munmap()
> >   maple_tree: Introduce __mas_set_range()
> >   mm: Remove re-walk from mmap_region()
> >   maple_tree: Re-introduce entry to mas_preallocate() arguments
> >   mm: Use vma_iter_clear_gfp() in nommu
> >   mm: Set up vma iterator for vma_iter_prealloc() calls
> >   maple_tree: Move mas_wr_end_piv() below mas_wr_extend_null()
> >   maple_tree: Update mas_preallocate() testing
> >   maple_tree: Refine mas_preallocate() node calculations
> >   mm/mmap: Change vma iteration order in do_vmi_align_munmap()
> > 
> >  fs/exec.c                        |   1 +
> >  include/linux/maple_tree.h       |  23 ++++-
> >  include/linux/mm.h               |   4 -
> >  lib/maple_tree.c                 |  78 ++++++++++----
> >  lib/test_maple_tree.c            |  74 +++++++++++++
> >  mm/internal.h                    |  40 ++++++--
> >  mm/memory.c                      |  16 ++-
> >  mm/mmap.c                        | 171 ++++++++++++++++---------------
> >  mm/nommu.c                       |  45 ++++----
> >  tools/testing/radix-tree/maple.c |  59 ++++++-----
> >  10 files changed, 331 insertions(+), 180 deletions(-)
> > 



