The patch titled Subject: maple_tree: add mt_free_one() and mt_attr() helpers has been added to the -mm mm-unstable branch. Its filename is maple_tree-add-mt_free_one-and-mt_attr-helpers.patch This patch will shortly appear at https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/maple_tree-add-mt_free_one-and-mt_attr-helpers.patch This patch will later appear in the mm-unstable branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Before you just go and hit "reply", please: a) Consider who else should be cc'ed b) Prefer to cc a suitable mailing list as well c) Ideally: find the original patch on the mailing list and do a reply-to-all to that, adding suitable additional cc's *** Remember to use Documentation/process/submit-checklist.rst when testing your code *** The -mm tree is included into linux-next via the mm-everything branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm and is updated there every 2-3 working days ------------------------------------------------------ From: Peng Zhang <zhangpeng.00@xxxxxxxxxxxxx> Subject: maple_tree: add mt_free_one() and mt_attr() helpers Date: Fri, 27 Oct 2023 11:38:36 +0800 Patch series "Introduce __mt_dup() to improve the performance of fork()", v7. This series introduces __mt_dup() to improve the performance of fork(). During the duplication process of mmap, all VMAs are traversed and inserted one by one into the new maple tree, causing the maple tree to be rebalanced multiple times. Balancing the maple tree is a costly operation. To duplicate VMAs more efficiently, mtree_dup() and __mt_dup() are introduced for the maple tree. They can efficiently duplicate a maple tree. Here are some algorithmic details about {mtree,__mt}_dup(). We perform a DFS pre-order traversal of all nodes in the source maple tree. During this process, we fully copy the nodes from the source tree to the new tree. This involves memory allocation, and when encountering a new node, if it is a non-leaf node, all its child nodes are allocated at once. This idea was originally from Liam R. Howlett's Maple Tree Work email, and I added some of my own ideas to implement it. Some previous discussions can be found in [1]. For a more detailed analysis of the algorithm, please refer to the logs for patch [3/10] and patch [10/10]. There is a "spawn" in byte-unixbench[2], which can be used to test the performance of fork(). I modified it slightly to make it work with different number of VMAs. Below are the test results. The first row shows the number of VMAs. The second and third rows show the number of fork() calls per ten seconds, corresponding to next-20231006 and the this patchset, respectively. The test results were obtained with CPU binding to avoid scheduler load balancing that could cause unstable results. There are still some fluctuations in the test results, but at least they are better than the original performance. 21 121 221 421 821 1621 3221 6421 12821 25621 51221 112100 76261 54227 34035 20195 11112 6017 3161 1606 802 393 114558 83067 65008 45824 28751 16072 8922 4747 2436 1233 599 2.19% 8.92% 19.88% 34.64% 42.37% 44.64% 48.28% 50.17% 51.68% 53.74% 52.42% Thanks to Liam and Matthew for the review. This patch (of 10): Add two helpers: 1. mt_free_one(), used to free a maple node. 2. mt_attr(), used to obtain the attributes of maple tree. Link: https://lkml.kernel.org/r/20231027033845.90608-1-zhangpeng.00@xxxxxxxxxxxxx Link: https://lkml.kernel.org/r/20231027033845.90608-2-zhangpeng.00@xxxxxxxxxxxxx Signed-off-by: Peng Zhang <zhangpeng.00@xxxxxxxxxxxxx> Reviewed-by: Liam R. Howlett <Liam.Howlett@xxxxxxxxxx> Cc: Christian Brauner <brauner@xxxxxxxxxx> Cc: Jonathan Corbet <corbet@xxxxxxx> Cc: Mateusz Guzik <mjguzik@xxxxxxxxx> Cc: Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx> Cc: Matthew Wilcox <willy@xxxxxxxxxxxxx> Cc: Michael S. Tsirkin <mst@xxxxxxxxxx> Cc: Mike Christie <michael.christie@xxxxxxxxxx> Cc: Nicholas Piggin <npiggin@xxxxxxxxx> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx> Cc: Suren Baghdasaryan <surenb@xxxxxxxxxx> Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> --- lib/maple_tree.c | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-) --- a/lib/maple_tree.c~maple_tree-add-mt_free_one-and-mt_attr-helpers +++ a/lib/maple_tree.c @@ -165,6 +165,11 @@ static inline int mt_alloc_bulk(gfp_t gf return kmem_cache_alloc_bulk(maple_node_cache, gfp, size, nodes); } +static inline void mt_free_one(struct maple_node *node) +{ + kmem_cache_free(maple_node_cache, node); +} + static inline void mt_free_bulk(size_t size, void __rcu **nodes) { kmem_cache_free_bulk(maple_node_cache, size, (void **)nodes); @@ -205,6 +210,11 @@ static unsigned int mas_mt_height(struct return mt_height(mas->tree); } +static inline unsigned int mt_attr(struct maple_tree *mt) +{ + return mt->ma_flags & ~MT_FLAGS_HEIGHT_MASK; +} + static inline enum maple_type mte_node_type(const struct maple_enode *entry) { return ((unsigned long)entry >> MAPLE_NODE_TYPE_SHIFT) & @@ -5573,7 +5583,7 @@ void mas_destroy(struct ma_state *mas) mt_free_bulk(count, (void __rcu **)&node->slot[1]); total -= count; } - kmem_cache_free(maple_node_cache, node); + mt_free_one(ma_mnode_ptr(node)); total--; } _ Patches currently in -mm which might be from zhangpeng.00@xxxxxxxxxxxxx are maple_tree-add-mt_free_one-and-mt_attr-helpers.patch maple_tree-introduce-mtreemas_lock_nested.patch maple_tree-introduce-interfaces-__mt_dup-and-mtree_dup.patch radix-tree-test-suite-align-kmem_cache_alloc_bulk-with-kernel-behavior.patch maple_tree-add-test-for-mtree_dup.patch maple_tree-update-the-documentation-of-maple-tree.patch maple_tree-skip-other-tests-when-bench-is-enabled.patch maple_tree-update-check_forking-and-bench_forking.patch maple_tree-preserve-the-tree-attributes-when-destroying-maple-tree.patch fork-use-__mt_dup-to-duplicate-maple-tree-in-dup_mmap.patch