On Fri, Aug 23, 2013 at 09:33:37AM +0900, Jonghwan Choi wrote: > This patch looks like it should be in the 3.10-stable tree, should we apply > it? Yes please. Thank you! > > ------------------ > > From: "Chuck Anderson <chuck.anderson@xxxxxxxxxx>" > > commit fc78d343fa74514f6fd117b5ef4cd27e4ac30236 upstream > > An older PVHVM guest (v3.0 based) crashed during vCPU hot-plug with: > > kernel BUG at drivers/xen/events.c:1328! > > RCU has detected that a CPU has not entered a quiescent state within the > grace period. It needs to send the CPU a reschedule IPI if it is not > offline. rcu_implicit_offline_qs() does this check: > > /* > * If the CPU is offline, it is in a quiescent state. We can > * trust its state not to change because interrupts are disabled. > */ > if (cpu_is_offline(rdp->cpu)) { > rdp->offline_fqs++; > return 1; > } > > Else the CPU is online. Send it a reschedule IPI. > > The CPU is in the middle of being hot-plugged and has been marked online > (!cpu_is_offline()). See start_secondary(): > > set_cpu_online(smp_processor_id(), true); > ... > per_cpu(cpu_state, smp_processor_id()) = CPU_ONLINE; > > start_secondary() then waits for the CPU bringing up the hot-plugged CPU to > mark it as active: > > /* > * Wait until the cpu which brought this one up marked it > * online before enabling interrupts. If we don't do that then > * we can end up waking up the softirq thread before this cpu > * reached the active state, which makes the scheduler unhappy > * and schedule the softirq thread on the wrong cpu. This is > * only observable with forced threaded interrupts, but in > * theory it could also happen w/o them. It's just way harder > * to achieve. > */ > while (!cpumask_test_cpu(smp_processor_id(), cpu_active_mask)) > cpu_relax(); > > /* enable local interrupts */ > local_irq_enable(); > > The CPU being hot-plugged will be marked active after it has been fully > initialized by the CPU managing the hot-plug. In the Xen PVHVM case > xen_smp_intr_init() is called to set up the hot-plugged vCPU's > XEN_RESCHEDULE_VECTOR. > > The hot-plugging CPU is marked online, not marked active and does not have > its IPI vectors set up. rcu_implicit_offline_qs() sees the hot-plugging > cpu is !cpu_is_offline() and tries to send it a reschedule IPI: > This will lead to: > > kernel BUG at drivers/xen/events.c:1328! > > xen_send_IPI_one() > xen_smp_send_reschedule() > rcu_implicit_offline_qs() > rcu_implicit_dynticks_qs() > force_qs_rnp() > force_quiescent_state() > __rcu_process_callbacks() > rcu_process_callbacks() > __do_softirq() > call_softirq() > do_softirq() > irq_exit() > xen_evtchn_do_upcall() > > because xen_send_IPI_one() will attempt to use an uninitialized IRQ for > the XEN_RESCHEDULE_VECTOR. > > There is at least one other place that has caused the same crash: > > xen_smp_send_reschedule() > wake_up_idle_cpu() > add_timer_on() > clocksource_watchdog() > call_timer_fn() > run_timer_softirq() > __do_softirq() > call_softirq() > do_softirq() > irq_exit() > xen_evtchn_do_upcall() > xen_hvm_callback_vector() > > clocksource_watchdog() uses cpu_online_mask to pick the next CPU to handle > a watchdog timer: > > /* > * Cycle through CPUs to check if the CPUs stay synchronized > * to each other. > */ > next_cpu = cpumask_next(raw_smp_processor_id(), cpu_online_mask); > if (next_cpu >= nr_cpu_ids) > next_cpu = cpumask_first(cpu_online_mask); > watchdog_timer.expires += WATCHDOG_INTERVAL; > add_timer_on(&watchdog_timer, next_cpu); > > This resulted in an attempt to send an IPI to a hot-plugging CPU that > had not initialized its reschedule vector. One option would be to make > the RCU code check to not check for CPU offline but for CPU active. > As becoming active is done after a CPU is online (in older kernels). > > But Srivatsa pointed out that "the cpu_active vs cpu_online ordering has > been > completely reworked - in the online path, cpu_active is set *before* > cpu_online, > and also, in the cpu offline path, the cpu_active bit is reset in the > CPU_DYING > notification instead of CPU_DOWN_PREPARE." Drilling in this the bring-up > path: "[brought up CPU].. send out a CPU_STARTING notification, and in > response > to that, the scheduler sets the CPU in the cpu_active_mask. Again, this mask > is better left to the scheduler alone, since it has the intelligence to use > it > judiciously." > > The conclusion was that: > " > 1. At the IPI sender side: > > It is incorrect to send an IPI to an offline CPU (cpu not present in > the cpu_online_mask). There are numerous places where we check this > and warn/complain. > > 2. At the IPI receiver side: > > It is incorrect to let the world know of our presence (by setting > ourselves in global bitmasks) until our initialization steps are complete > to such an extent that we can handle the consequences (such as > receiving interrupts without crashing the sender etc.) > " (from Srivatsa) > > As the native code enables the interrupts at some point we need to be > able to service them. In other words a CPU must have valid IPI vectors > if it has been marked online. > > It doesn't need to handle the IPI (interrupts may be disabled) but needs > to have valid IPI vectors because another CPU may find it in cpu_online_mask > and attempt to send it an IPI. > > This patch will change the order of the Xen vCPU bring-up functions so that > Xen vectors have been set up before start_secondary() is called. > It also will not continue to bring up a Xen vCPU if xen_smp_intr_init() > fails > to initialize it. > > Orabug 13823853 > Signed-off-by Chuck Anderson <chuck.anderson@xxxxxxxxxx> > Acked-by: Srivatsa S. Bhat <srivatsa.bhat@xxxxxxxxxxxxxxxxxx> > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx> > Signed-off-by: Jonghwan Choi <jhbird.choi@xxxxxxxxxxx> > --- > arch/x86/xen/smp.c | 11 +++++++++-- > 1 file changed, 9 insertions(+), 2 deletions(-) > > diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c > index d99cae8..a1e58e1 100644 > --- a/arch/x86/xen/smp.c > +++ b/arch/x86/xen/smp.c > @@ -667,8 +667,15 @@ static void __init xen_hvm_smp_prepare_cpus(unsigned > int max_cpus) > static int __cpuinit xen_hvm_cpu_up(unsigned int cpu, struct task_struct > *tidle) > { > int rc; > - rc = native_cpu_up(cpu, tidle); > - WARN_ON (xen_smp_intr_init(cpu)); > + /* > + * xen_smp_intr_init() needs to run before native_cpu_up() > + * so that IPI vectors are set up on the booting CPU before > + * it is marked online in native_cpu_up(). > + */ > + rc = xen_smp_intr_init(cpu); > + WARN_ON(rc); > + if (!rc) > + rc = native_cpu_up(cpu, tidle); > return rc; > } > > -- > 1.7.9. > -- To unsubscribe from this list: send the line "unsubscribe stable" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html