* Peng Zhang <zhangpeng.00@xxxxxxxxxxxxx> [230830 08:57]: > In the process of duplicating mmap in fork(), VMAs will be inserted into the new > maple tree one by one. When inserting into the maple tree, the maple tree will > be rebalanced multiple times. The rebalancing of maple tree is not as fast as > the rebalancing of red-black tree and will be slower. Therefore, __mt_dup() is > introduced to directly duplicate the structure of the old maple tree, and then > modify each element of the new maple tree. This avoids rebalancing and some extra > copying, so is faster than the original method. > More information can refer to [1]. Thanks for this patch set, it's really coming along nicely. > > There is a "spawn" in byte-unixbench[2], which can be used to test the performance > of fork(). I modified it slightly to make it work with different number of VMAs. > > Below are the test numbers. There are 21 VMAs by default. The first row indicates > the number of added VMAs. The following two lines are the number of fork() calls > every 10 seconds. These numbers are different from the test results in v1 because > this time the benchmark is bound to a CPU. This way the numbers are more stable. > > Increment of VMAs: 0 100 200 400 800 1600 3200 6400 > 6.5.0-next-20230829: 111878 75531 53683 35282 20741 11317 6110 3158 > Apply this patchset: 114531 85420 64541 44592 28660 16371 9038 4831 > +2.37% +13.09% +20.23% +26.39% +38.18% +44.66% +47.92% +52.98% Can you run this with the default 21 as well? > > Todo: > - Update the documentation. > > Changes since v1: > - Reimplement __mt_dup() and mtree_dup(). Loops are implemented without using > goto instructions. > - The new tree also needs to be locked to avoid some lockdep warnings. > - Drop and add some helpers. I guess this also includes the changes to remove the new ways of finding a node end and using that extra bit in the address? Those were significant and welcome changes. Thanks. > - Add test for duplicating full tree. > - Drop mas_replace_entry(), it doesn't seem to have a big impact on the > performance of fork(). > > [1] https://lore.kernel.org/lkml/463899aa-6cbd-f08e-0aca-077b0e4e4475@xxxxxxxxxxxxx/ > [2] https://github.com/kdlucas/byte-unixbench/tree/master > > v1: https://lore.kernel.org/lkml/20230726080916.17454-1-zhangpeng.00@xxxxxxxxxxxxx/ > > Peng Zhang (6): > maple_tree: Add two helpers > maple_tree: Introduce interfaces __mt_dup() and mtree_dup() > maple_tree: Add test for mtree_dup() > maple_tree: Skip other tests when BENCH is enabled > maple_tree: Update check_forking() and bench_forking() > fork: Use __mt_dup() to duplicate maple tree in dup_mmap() > > include/linux/maple_tree.h | 3 + > kernel/fork.c | 34 ++- > lib/maple_tree.c | 277 ++++++++++++++++++++++++- > lib/test_maple_tree.c | 69 +++--- > mm/mmap.c | 14 +- > tools/testing/radix-tree/maple.c | 346 +++++++++++++++++++++++++++++++ > 6 files changed, 697 insertions(+), 46 deletions(-) > > -- > 2.20.1 >