Re: [PATCH v2 0/6] Introduce __mt_dup() to improve the performance of fork()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



* Peng Zhang <zhangpeng.00@xxxxxxxxxxxxx> [230830 08:57]:
> In the process of duplicating mmap in fork(), VMAs will be inserted into the new
> maple tree one by one. When inserting into the maple tree, the maple tree will
> be rebalanced multiple times. The rebalancing of maple tree is not as fast as
> the rebalancing of red-black tree and will be slower. Therefore, __mt_dup() is
> introduced to directly duplicate the structure of the old maple tree, and then
> modify each element of the new maple tree. This avoids rebalancing and some extra
> copying, so is faster than the original method.
> More information can refer to [1].

Thanks for this patch set, it's really coming along nicely.

> 
> There is a "spawn" in byte-unixbench[2], which can be used to test the performance
> of fork(). I modified it slightly to make it work with different number of VMAs.
> 
> Below are the test numbers. There are 21 VMAs by default. The first row indicates
> the number of added VMAs. The following two lines are the number of fork() calls
> every 10 seconds. These numbers are different from the test results in v1 because
> this time the benchmark is bound to a CPU. This way the numbers are more stable.
> 
>   Increment of VMAs: 0      100     200     400     800     1600    3200    6400
> 6.5.0-next-20230829: 111878 75531   53683   35282   20741   11317   6110    3158
> Apply this patchset: 114531 85420   64541   44592   28660   16371   9038    4831
>                      +2.37% +13.09% +20.23% +26.39% +38.18% +44.66% +47.92% +52.98%

Can you run this with the default 21 as well?

> 
> Todo:
>   - Update the documentation.
> 
> Changes since v1:
>  - Reimplement __mt_dup() and mtree_dup(). Loops are implemented without using
>    goto instructions.
>  - The new tree also needs to be locked to avoid some lockdep warnings.
>  - Drop and add some helpers.

I guess this also includes the changes to remove the new ways of finding
a node end and using that extra bit in the address?  Those were
significant and welcome changes.  Thanks.

>  - Add test for duplicating full tree.
>  - Drop mas_replace_entry(), it doesn't seem to have a big impact on the
>    performance of fork().
> 
> [1] https://lore.kernel.org/lkml/463899aa-6cbd-f08e-0aca-077b0e4e4475@xxxxxxxxxxxxx/
> [2] https://github.com/kdlucas/byte-unixbench/tree/master
> 
> v1: https://lore.kernel.org/lkml/20230726080916.17454-1-zhangpeng.00@xxxxxxxxxxxxx/
> 
> Peng Zhang (6):
>   maple_tree: Add two helpers
>   maple_tree: Introduce interfaces __mt_dup() and mtree_dup()
>   maple_tree: Add test for mtree_dup()
>   maple_tree: Skip other tests when BENCH is enabled
>   maple_tree: Update check_forking() and bench_forking()
>   fork: Use __mt_dup() to duplicate maple tree in dup_mmap()
> 
>  include/linux/maple_tree.h       |   3 +
>  kernel/fork.c                    |  34 ++-
>  lib/maple_tree.c                 | 277 ++++++++++++++++++++++++-
>  lib/test_maple_tree.c            |  69 +++---
>  mm/mmap.c                        |  14 +-
>  tools/testing/radix-tree/maple.c | 346 +++++++++++++++++++++++++++++++
>  6 files changed, 697 insertions(+), 46 deletions(-)
> 
> -- 
> 2.20.1
> 




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux