On 11/30/2021 9:51 PM, Frederic Weisbecker wrote:
While looping through the rnp's CPUs to IPI for an expedited grace
period, a first pass excludes the current CPU and the CPUs in dynticks
idle mode. The workqueue will report their QS on their behalf later.

The second pass processes the IPIs and also ignores the current CPU,
assuming it has been previously included in the group of CPUs whose
QS are to be reported by the workqueue.

Unfortunately the current CPU may have changed between the first and
second pass, due to the rnp lock being dropped, re-enabling preemption.
As a result the current CPU, if different in the second pass, may be
ignored by the expedited grace period. No IPI will be sent to it so it
won't be requested to report an expedited quiescent state.

This ends up in an expedited grace period stall.

Fix this with including the current CPU in the second round in the group
of CPUs to report a QS for by the workqueue.

Fixes: b9ad4d6ed18e ("rcu: Avoid self-IPI in sync_rcu_exp_select_node_cpus()")
Signed-off-by: Frederic Weisbecker <frederic@xxxxxxxxxx>
Cc: Uladzislau Rezki <urezki@xxxxxxxxx>
Cc: Neeraj Upadhyay <quic_neeraju@xxxxxxxxxxx>
Cc: Boqun Feng <boqun.feng@xxxxxxxxx>
Cc: Josh Triplett <josh@xxxxxxxxxxxxxxxx>
Cc: Joel Fernandes <joel@xxxxxxxxxxxxxxxxx>
---

Reviewed-by: Neeraj Upadhyay <quic_neeraju@xxxxxxxxxxx>

Thanks
Neeraj

 kernel/rcu/tree_exp.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h
index a96d17206d87..237a79989aba 100644
--- a/kernel/rcu/tree_exp.h
+++ b/kernel/rcu/tree_exp.h
@@ -387,6 +387,7 @@ static void sync_rcu_exp_select_node_cpus(struct work_struct *wp)
 			continue;
 		}
 		if (get_cpu() == cpu) {
+			mask_ofl_test |= mask;
 			put_cpu();
 			continue;
 		}
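
For readers following the thread, the race described in the commit message can be sketched as a small user-space model. This is not kernel code, only an illustration under stated assumptions: the helpers select_cpus() and missed_cpus(), the struct exp_result, and the cpu_pass1/cpu_pass2/idle_mask/fixed parameters are all hypothetical, a 32-bit mask stands in for the rnp's CPU masks, and migration is modelled simply by passing a different "current CPU" to each pass. Only mask_ofl_test and mask come from the diff above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result of one simulated selection round. */
struct exp_result {
	uint32_t ipi_mask;	/* CPUs that were sent an IPI in pass 2 */
	uint32_t wq_mask;	/* CPUs whose QS the workqueue reports */
};

/*
 * Simplified model of the two passes (assumption: at most 32 CPUs).
 * cpu_pass1 / cpu_pass2 are the CPU the task runs on during each pass;
 * they differ if the task migrated while the lock was dropped.
 * fixed != 0 applies the one-line change from the patch.
 */
static struct exp_result select_cpus(uint32_t expmask, uint32_t idle_mask,
				     int cpu_pass1, int cpu_pass2, int fixed)
{
	struct exp_result r = { 0, 0 };
	uint32_t mask_ofl_test = 0;
	uint32_t mask_ofl_ipi;
	int cpu;

	assert(cpu_pass1 >= 0 && cpu_pass1 < 32);
	assert(cpu_pass2 >= 0 && cpu_pass2 < 32);

	/* Pass 1: the current CPU and idle CPUs are left to the workqueue. */
	for (cpu = 0; cpu < 32; cpu++) {
		uint32_t mask = UINT32_C(1) << cpu;

		if (!(expmask & mask))
			continue;
		if (cpu == cpu_pass1 || (idle_mask & mask))
			mask_ofl_test |= mask;
	}
	mask_ofl_ipi = expmask & ~mask_ofl_test;

	/* The lock is dropped here, so the task may migrate. */

	/* Pass 2: IPI the remaining CPUs, skipping the current one. */
	for (cpu = 0; cpu < 32; cpu++) {
		uint32_t mask = UINT32_C(1) << cpu;

		if (!(mask_ofl_ipi & mask))
			continue;
		if (cpu == cpu_pass2) {
			if (fixed)
				mask_ofl_test |= mask;	/* the patch */
			continue;
		}
		r.ipi_mask |= mask;
	}

	r.wq_mask = mask_ofl_test;
	return r;
}

/* CPUs that neither got an IPI nor are reported by the workqueue. */
static uint32_t missed_cpus(uint32_t expmask, struct exp_result r)
{
	return expmask & ~(r.ipi_mask | r.wq_mask);
}
```

With expmask 0xF, CPU 3 idle, and the task moving from CPU 0 to CPU 1 between the passes, missed_cpus() returns 0x2 for the unfixed variant (CPU 1 is never asked for a quiescent state, which models the stall) and 0 once the extra mask_ofl_test |= mask is applied.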