call_rcu() changes to save power will slow down percpu refcounter per-CPU to atomic switch path. The primitive uses RCU when switching to atomic mode. The enqueued async callback wakes up waiters waiting in the percpu_ref_switch_waitq. This will slow down the per-CPU refcount users such as blk_pre_runtime_suspend(). Use the call_rcu_flush() API instead which reverts to the old behavior. Signed-off-by: Joel Fernandes (Google) <joel@xxxxxxxxxxxxxxxxx> --- lib/percpu-refcount.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/lib/percpu-refcount.c b/lib/percpu-refcount.c index e5c5315da274..65c58a029297 100644 --- a/lib/percpu-refcount.c +++ b/lib/percpu-refcount.c @@ -230,7 +230,8 @@ static void __percpu_ref_switch_to_atomic(struct percpu_ref *ref, percpu_ref_noop_confirm_switch; percpu_ref_get(ref); /* put after confirmation */ - call_rcu(&ref->data->rcu, percpu_ref_switch_to_atomic_rcu); + call_rcu_flush(&ref->data->rcu, + percpu_ref_switch_to_atomic_rcu); } static void __percpu_ref_switch_to_percpu(struct percpu_ref *ref) -- 2.37.3.998.g577e59143f-goog