We can relax the accuracy of the queued-object count in favor of
performance by sampling the per-CPU counters locklessly. This should be
OK: under high memory pressure it does not matter if the count is off by
a few objects, since the shrinker will still do the reclaim.

Signed-off-by: Joel Fernandes (Google) <joel@xxxxxxxxxxxxxxxxx>
---
 kernel/rcu/tree.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index dc570dff68d7b..875e7162ddcce 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2916,7 +2916,7 @@ static inline bool queue_kfree_rcu_work(struct kfree_rcu_cpu *krcp)
 				krcp->head = NULL;
 			}
 
-			krcp->count = 0;
+			WRITE_ONCE(krcp->count, 0);
 
 			/*
 			 * One work is per one batch, so there are two "free channels",
@@ -3054,7 +3054,7 @@ void kfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
 		krcp->head = head;
 	}
 
-	krcp->count++;
+	WRITE_ONCE(krcp->count, krcp->count + 1);
 
 	// Set timer to drain after KFREE_DRAIN_JIFFIES.
 	if (rcu_scheduler_active == RCU_SCHEDULER_RUNNING &&
@@ -3080,9 +3080,7 @@ kfree_rcu_shrink_count(struct shrinker *shrink, struct shrink_control *sc)
 	for_each_online_cpu(cpu) {
 		struct kfree_rcu_cpu *krcp = per_cpu_ptr(&krc, cpu);
 
-		spin_lock_irqsave(&krcp->lock, flags);
-		count += krcp->count;
-		spin_unlock_irqrestore(&krcp->lock, flags);
+		count += READ_ONCE(krcp->count);
 	}
 
 	return count;
-- 
2.25.1.481.gfbce0eb801-goog
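
For illustration only, the pattern used above can be sketched in userspace C. This is not the kernel code. The READ_ONCE()/WRITE_ONCE() macros below are simplified stand-ins built on volatile accesses and the GCC/Clang __typeof__ extension, and NR_SLOTS, struct slot_counter, slot_enqueue(), slot_drain() and approx_total() are hypothetical names. Each counter is assumed to have a single writer (in the patch, the writer holds krcp->lock), while the reader sums all counters without taking any lock and accepts a slightly stale total.

```c
#include <stddef.h>

/*
 * Simplified userspace stand-ins for the kernel macros (assumption:
 * a volatile access is enough for this sketch; the kernel versions
 * do more checking).
 */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

#define NR_SLOTS 4	/* hypothetical stand-in for the number of CPUs */

struct slot_counter {
	unsigned long count;	/* updated only by the slot's owner */
};

static struct slot_counter slots[NR_SLOTS];

/* Writer side: one object queued on this slot. */
static void slot_enqueue(int slot)
{
	WRITE_ONCE(slots[slot].count, slots[slot].count + 1);
}

/* Writer side: the slot's queue was handed off, so reset its count. */
static void slot_drain(int slot)
{
	WRITE_ONCE(slots[slot].count, 0);
}

/*
 * Reader side: lockless sum over all slots. Each load is a single
 * untorn access, but the total is only approximate if writers run
 * concurrently.
 */
static unsigned long approx_total(void)
{
	unsigned long total = 0;

	for (int i = 0; i < NR_SLOTS; i++)
		total += READ_ONCE(slots[i].count);

	return total;
}
```

After two slot_enqueue(0) calls and one slot_enqueue(2), approx_total() returns 3 when no writer is running concurrently. With concurrent writers the result may lag by a few objects, which is the trade-off the commit message accepts.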