Hi Paul,

On Fri, Jun 28, 2019 at 01:04:23PM -0700, Paul E. McKenney wrote:
[snip]
> > > > Commit
> > > > - 23634ebc1d946 ("rcu: Check for wakeup-safe conditions in
> > > >   rcu_read_unlock_special()") does not trigger the bug within 94
> > > >   attempts.
> > > >
> > > > - 48d07c04b4cc1 ("rcu: Enable elimination of Tree-RCU softirq
> > > >   processing") needed 12 attempts to trigger the bug.
> > >
> > > That matches my belief that 23634ebc1d946 ("rcu: Check for wakeup-safe
> > > conditions in rcu_read_unlock_special()") will at least greatly decrease
> > > the probability of this bug occurring.
> >
> > I was just typing a reply that I can't reproduce it with:
> >   rcu: Check for wakeup-safe conditions in rcu_read_unlock_special()
> >
> > I am trying to revert enough of this patch to see what would break things,
> > however I think a better exercise might be to understand more what the patch
> > does and why it fixes things in the first place ;-) It is probably the
> > deferred_qs thing.
>
> The deferred_qs flag is part of it!  Looking forward to hearing what
> you come up with as being the critical piece of this commit.

The new deferred_qs flag indeed saves the machine from the deadlock.

If we don't want the deferred_qs, then the below patch also fixes the
issue. However, I am more sure than not that it does not handle all
cases (such as: what if we previously had an expedited grace-period IPI
in a previous reader section and had to defer processing? Then it seems
a similar deadlock would present itself). But anyway, the below patch
does fix it for me! It is based on your -rcu tree commit
23634ebc1d946f19eb112d4455c1d84948875e31 ("rcu: Check for wakeup-safe
conditions in rcu_read_unlock_special()").
---8<-----------------------

From: "Joel Fernandes (Google)" <joel@xxxxxxxxxxxxxxxxx>
Subject: [PATCH] Fix RCU recursive deadlock

Signed-off-by: Joel Fernandes (Google) <joel@xxxxxxxxxxxxxxxxx>
---
 include/linux/sched.h    |  2 +-
 kernel/rcu/tree_plugin.h | 17 +++++++++++++----
 2 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 942a44c1b8eb..347e6dfcc91b 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -565,7 +565,7 @@ union rcu_special {
 		u8 blocked;
 		u8 need_qs;
 		u8 exp_hint; /* Hint for performance. */
-		u8 deferred_qs;
+		u8 pad;
 	} b; /* Bits. */
 	u32 s; /* Set of bits. */
 };
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 75110ea75d01..5b9b12c1ba5c 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -455,7 +455,6 @@ rcu_preempt_deferred_qs_irqrestore(struct task_struct *t, unsigned long flags)
 		local_irq_restore(flags);
 		return;
 	}
-	t->rcu_read_unlock_special.b.deferred_qs = false;
 	if (special.b.need_qs) {
 		rcu_qs();
 		t->rcu_read_unlock_special.b.need_qs = false;
@@ -608,13 +607,24 @@ static void rcu_read_unlock_special(struct task_struct *t)
 	if (preempt_bh_were_disabled || irqs_were_disabled) {
 		t->rcu_read_unlock_special.b.exp_hint = false;
 		// Need to defer quiescent state until everything is enabled.
+
+		/* If unlock_special was called in the current reader section
+		 * just because we were blocked in a previous reader section,
+		 * then raising softirqs can deadlock. This is because the
+		 * scheduler executes RCU sections with preemption disabled,
+		 * however it may have previously blocked in a previous
+		 * non-scheduler reader section and .blocked got set. It is
+		 * never safe to call unlock_special from the scheduler path
+		 * due to recursive wake ups (unless we are in_irq()), so
+		 * prevent this by checking if we were previously blocked.
+		 */
 		if (irqs_were_disabled && use_softirq &&
-		    (in_irq() || !t->rcu_read_unlock_special.b.deferred_qs)) {
+		    (!t->rcu_read_unlock_special.b.blocked || in_irq())) {
 			// Using softirq, safe to awaken, and we get
 			// no help from enabling irqs, unlike bh/preempt.
 			raise_softirq_irqoff(RCU_SOFTIRQ);
 		} else if (irqs_were_disabled && !use_softirq &&
-			   !t->rcu_read_unlock_special.b.deferred_qs) {
+			   !t->rcu_read_unlock_special.b.blocked) {
 			// Safe to awaken and we get no help from enabling
 			// irqs, unlike bh/preempt.
 			invoke_rcu_core();
@@ -623,7 +633,6 @@ static void rcu_read_unlock_special(struct task_struct *t)
 		set_tsk_need_resched(current);
 		set_preempt_need_resched();
 	}
-	t->rcu_read_unlock_special.b.deferred_qs = true;
 	local_irq_restore(flags);
 	return;
 }
--
2.22.0.410.gd8fdbe21b5-goog