> Recently a discussion about stability and performance of a system
> involving a high rate of kfree_rcu() calls surfaced on the list [1]
> which led to another discussion how to prepare for this situation.
>
> This patch adds basic batching support for kfree_rcu(). It is "basic"
> because we do none of the slab management, dynamic allocation, code
> moving or any of the other things, some of which previous attempts did
> [2]. These fancier improvements can be follow-up patches and there are
> different ideas being discussed in those regards. This is an effort to
> start simple, and build up from there. In the future, an extension to
> use kfree_bulk and possibly per-slab batching could be done to further
> improve performance due to cache-locality and slab-specific bulk free
> optimizations. By using an array of pointers, the worker thread
> processing the work would need to read lesser data since it does not
> need to deal with large rcu_head(s) any longer.
>
> Torture tests follow in the next patch and show improvements of around
> 5x reduction in number of grace periods on a 16 CPU system. More
> details and test data are in that patch.
>
> There is an implication with rcu_barrier() with this patch. Since the
> kfree_rcu() calls can be batched, and may not be handed yet to the RCU
> machinery in fact, the monitor may not have even run yet to do the
> queue_rcu_work(), there seems no easy way of implementing rcu_barrier()
> to wait for those kfree_rcu()s that are already made. So this means a
> kfree_rcu() followed by an rcu_barrier() does not imply that memory will
> be freed once rcu_barrier() returns.
>
> Another implication is higher active memory usage (although not
> run-away..) until the kfree_rcu() flooding ends, in comparison to
> without batching. More details about this are in the second patch which
> adds an rcuperf test.
>
> Finally, in the near future we will get rid of kfree_rcu() special casing
> within RCU such as in rcu_do_batch and switch everything to just
> batching. Currently we don't do that since timer subsystem is not yet up
> and we cannot schedule the kfree_rcu() monitor as the timer subsystem's
> lock are not initialized. That would also mean getting rid of
> kfree_call_rcu_nobatch() entirely.
>
Hello, Joel.

First of all, thank you for improving this. I also noticed high pressure
on the RCU machinery while running some vmalloc tests when a kfree_rcu()
flood occurred. Therefore I stopped using kfree_rcu() there.

I have just a small question related to workloads and performance
evaluation. Are you aware of any specific workloads that benefit from
this, for example in the mobile area? I am asking because I am thinking
about backporting it and reusing it in our kernel.

Thank you!

--
Vlad Rezki