* David Hildenbrand <david@xxxxxxxxxx> [230124 08:18]:
> On 06.01.23 19:36, Liam Howlett wrote:
> > * Mike Rapoport <rppt@xxxxxxxxxx> [230106 02:28]:
> > > On Thu, Jan 05, 2023 at 04:05:34PM +0000, Liam Howlett wrote:
> > > > Preallocations are common in the VMA code to avoid allocating under
> > > > certain locking conditions.  The preallocations must also cover the
> > > > worst-case scenario.  Removing the GFP_ZERO flag from the
> > > > kmem_cache_alloc() (and bulk variant) calls will reduce the amount of
> > > > time spent zeroing memory that may not be used.  Only zero out the
> > > > necessary area to keep track of the allocations in the maple state.
> > > > Zero the entire node prior to using it in the tree.
> > > >
> > > > This required internal changes to node counting on allocation, so the
> > > > test code is also updated.
> > > >
> > > > This restores some micro-benchmark performance:
> > > >   up to +9% in mmtests mmap1 by my testing
> > > >   +10% to +20% in mmap, mmapaddr, mmapmany tests reported by Red Hat
> > > >
> > > > Link: https://bugzilla.redhat.com/show_bug.cgi?id=2149636
> > > > Reported-by: Jirka Hladky <jhladky@xxxxxxxxxx>
> > > > Suggested-by: Matthew Wilcox (Oracle) <willy@xxxxxxxxxxxxx>
> > > > Signed-off-by: Liam Howlett <Liam.Howlett@xxxxxxxxxx>
> > > > ---
> > > >  lib/maple_tree.c                 | 80 +++++++++++++++++---------------
> > > >  tools/testing/radix-tree/maple.c | 18 +++----
> > > >  2 files changed, 52 insertions(+), 46 deletions(-)
> > > >
> > > > diff --git a/lib/maple_tree.c b/lib/maple_tree.c
> > > > index 26e2045d3cda..82a8121fe49b 100644
> > > > --- a/lib/maple_tree.c
> > > > +++ b/lib/maple_tree.c
> > > > @@ -149,13 +149,12 @@ struct maple_subtree_state {
> > > >  /* Functions */
> > > >  static inline struct maple_node *mt_alloc_one(gfp_t gfp)
> > > >  {
> > > > -	return kmem_cache_alloc(maple_node_cache, gfp | __GFP_ZERO);
> > > > +	return kmem_cache_alloc(maple_node_cache, gfp);
> > > >  }
> > > >
> > > >  static inline int mt_alloc_bulk(gfp_t gfp, size_t size, void **nodes)
> > > >  {
> > > > -	return kmem_cache_alloc_bulk(maple_node_cache, gfp | __GFP_ZERO, size,
> > > > -				     nodes);
> > > > +	return kmem_cache_alloc_bulk(maple_node_cache, gfp, size, nodes);
> > > >  }
> > > >
> > > >  static inline void mt_free_bulk(size_t size, void __rcu **nodes)
> > > > @@ -1127,9 +1126,10 @@ static inline struct maple_node *mas_pop_node(struct ma_state *mas)
> > > >  {
> > > >  	struct maple_alloc *ret, *node = mas->alloc;
> > > >  	unsigned long total = mas_allocated(mas);
> > > > +	unsigned int req = mas_alloc_req(mas);
> > > >
> > > >  	/* nothing or a request pending. */
> > > > -	if (unlikely(!total))
> > > > +	if (WARN_ON(!total))
> > >
> > > Hmm, isn't WARN_ON() here too much?
> >
> > I don't think so.  If we get to the point of asking for a node while we
> > don't have any to give, then it's too late.  It means we (I) have
> > calculated the necessary nodes incorrectly and we won't have enough
> > memory to fit things into the tree.  It should never happen.
>
> Either way, the suggestion is to use WARN_ON_ONCE() instead of WARN_ON().
>
> ... now documented in Documentation/process/coding-style.rst

Thanks, this is good information.  I was actually debating a BUG_ON(),
but it has been documented that WARN_ON() is better, and WARN_ON_ONCE()
is best.  I would like the stack trace for any code path that happens to
cause it, even if a previous code path has already triggered the
backtrace.  The memory should have all been obtained, or we entered a
previous error path, so an OOM issue will not cause a cascade of
flooding backtraces.  This would also result in a null pointer
dereference today, so I think it's safe to say it's rare.

Thanks,
Liam