On Fri, Oct 27, 2023 at 07:36:51PM +0100, David Woodhouse wrote: > From: David Woodhouse <dwmw@xxxxxxxxxxxx> > > Xen HVM guests were observed taking triple-faults when attempting to > online a previously offlined vCPU. > > Investigation showed that the fault was coming from a failing call > to lockdep_assert_irqs_disabled(), in load_current_idt() which was > too early in the CPU bringup to actually catch the exception and > report the failure cleanly. > > This was a false positive, caused by acpi_idle_play_dead() setting > the per-cpu hardirqs_enabled flag by calling safe_halt(). Switch it > to use raw_safe_halt() instead, which doesn't do so. > > Signed-off-by: David Woodhouse <dwmw@xxxxxxxxxxxx> > --- > We might {also,instead} explicitly set the hardirqs_enabled flag to > zero when bringing up an AP? So I fixed up the idle paths the other day (see all that __cpuidle stuff) but I've not yet gone through the whole hotplug thing :/ This seems right, at this point everything, including RCU is very much gone, any instrumentation is undesired. Acked-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx> > > drivers/acpi/processor_idle.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c > index 3a34a8c425fe..55437f5e0c3a 100644 > --- a/drivers/acpi/processor_idle.c > +++ b/drivers/acpi/processor_idle.c > @@ -592,7 +592,7 @@ static int acpi_idle_play_dead(struct cpuidle_device *dev, int index) > while (1) { > > if (cx->entry_method == ACPI_CSTATE_HALT) > - safe_halt(); > + raw_safe_halt(); > else if (cx->entry_method == ACPI_CSTATE_SYSTEMIO) { > io_idle(cx->address); > } else > -- > 2.41.0 > >