On Wed, Aug 14, 2019 at 01:22:33PM -0400, Joel Fernandes wrote:
> On Wed, Aug 14, 2019 at 10:38:17AM -0400, Joel Fernandes wrote:
> > On Tue, Aug 13, 2019 at 12:07:38PM -0700, Paul E. McKenney wrote:
> [snip]
> > > > - * Queue an RCU callback for lazy invocation after a grace period.
> > > > - * This will likely be later named something like "call_rcu_lazy()",
> > > > - * but this change will require some way of tagging the lazy RCU
> > > > - * callbacks in the list of pending callbacks.  Until then, this
> > > > - * function may only be called from __kfree_rcu().
> > > > + * Maximum number of kfree(s) to batch, if this limit is hit then the batch of
> > > > + * kfree(s) is queued for freeing after a grace period, right away.
> > > >   */
> > > > -void kfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
> > > > +struct kfree_rcu_cpu {
> > > > +	/* The rcu_work node for queuing work with queue_rcu_work(). The work
> > > > +	 * is done after a grace period.
> > > > +	 */
> > > > +	struct rcu_work rcu_work;
> > > > +
> > > > +	/* The list of objects being queued in a batch but are not yet
> > > > +	 * scheduled to be freed.
> > > > +	 */
> > > > +	struct rcu_head *head;
> > > > +
> > > > +	/* The list of objects that have now left ->head and are queued for
> > > > +	 * freeing after a grace period.
> > > > +	 */
> > > > +	struct rcu_head *head_free;
> > >
> > > So this is not yet the one that does multiple batches concurrently
> > > awaiting grace periods, correct?  Or am I missing something subtle?
> >
> > Yes, it is not. I honestly, still did not understand that idea. Or how it
> > would improve things. May be we can discuss at LPC on pen and paper? But I
> > think that can also be a follow-up optimization.
>
> I got it now. Basically we can benefit a bit more by having another list
> (that is have multiple kfree_rcu batches in flight). I will think more about
> it - but hopefully we don't need to gate this patch by that.

I am willing to take this as a later optimization.

> It'll be interesting to see what rcuperf says about such an improvement :)

Indeed, no guarantees either way.  The reason for hope assumes a busy
system where each grace period is immediately followed by another
grace period.  On such a system, the current setup allows each CPU to
make use only of every second grace period for its kfree_rcu() work.
The hope would therefore be that this would reduce the memory footprint
substantially with no increase in overhead.

But no way to know without trying it!  ;-)

							Thanx, Paul
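
To illustrate the "every second grace period" argument, here is a toy
userspace model, not kernel code.  Everything in it is an assumption made
for illustration: the names simulate(), struct sim_result and MAX_SLOTS are
hypothetical, grace periods are modeled as back-to-back unit-length ticks,
objects arrive at a constant rate, and a batch handed off shortly after a
grace period starts is freed only at the end of the following grace period.
One slot stands in for the single ->head_free list in the patch; two slots
stand in for having a second batch in flight.

```c
#include <assert.h>

#define MAX_SLOTS 4	/* hypothetical cap, only for this toy model */

struct sim_result {
	long peak_pending;	/* most objects awaiting kfree at any one time */
	int gps_used;		/* grace periods whose end freed a batch */
};

/*
 * Model one CPU over nr_gps back-to-back grace periods.  "slots" is the
 * number of batches that may await a grace period concurrently, and
 * "rate" is the number of objects queued per grace period.
 */
struct sim_result simulate(int slots, int nr_gps, long rate)
{
	long head = 0;			/* objects batched but not yet handed off */
	long batch[MAX_SLOTS] = { 0 };	/* objects in each in-flight batch */
	int free_at[MAX_SLOTS];		/* GP whose end frees the batch, -1 if idle */
	struct sim_result res = { 0, 0 };
	int gp, s;

	assert(slots >= 1 && slots <= MAX_SLOTS);
	for (s = 0; s < slots; s++)
		free_at[s] = -1;

	for (gp = 0; gp < nr_gps; gp++) {
		long pending;
		int freed = 0;

		/*
		 * Shortly after this GP starts, hand the pending batch to an
		 * idle slot.  The current GP is already in progress, so the
		 * batch must wait for the end of the next one.
		 */
		for (s = 0; s < slots && head > 0; s++) {
			if (free_at[s] < 0) {
				batch[s] = head;
				head = 0;
				free_at[s] = gp + 1;
			}
		}

		/* Objects queued while this GP runs. */
		head += rate;

		/* Footprint just before this GP ends. */
		pending = head;
		for (s = 0; s < slots; s++)
			pending += batch[s];
		if (pending > res.peak_pending)
			res.peak_pending = pending;

		/* This GP ends: free every batch that was waiting for it. */
		for (s = 0; s < slots; s++) {
			if (free_at[s] == gp) {
				batch[s] = 0;
				free_at[s] = -1;
				freed = 1;
			}
		}
		res.gps_used += freed;
	}
	return res;
}
```

Under these assumptions, simulate(1, 10, 100) frees a batch at the end of
only every second grace period, while simulate(2, 10, 100) frees one at the
end of every grace period once it is warmed up, and its peak_pending is
lower.  The size of that gap is an artifact of the toy timing model and
says nothing about what rcuperf would actually show.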