On Mon, Aug 17, 2020 at 11:33 AM Raj, Ashok <ashok.raj@xxxxxxxxx> wrote: > > Hi Evan > > Some details below, > > On Mon, Aug 17, 2020 at 11:12:17AM -0700, Evan Green wrote: > > Hi Ashok, > > Thank you, Srikanth, and Sukumar for some very impressive debugging here. > > > > On Fri, Aug 14, 2020 at 2:38 PM Ashok Raj <ashok.raj@xxxxxxxxx> wrote: > > > > > > When offlining CPU's, fixup_irqs() migrates all interrupts away from the > > > outgoing CPU to an online CPU. Its always possible the device sent an > > > interrupt to the previous CPU destination. Pending interrupt bit in IRR in > > > lapic identifies such interrupts. apic_soft_disable() will not capture any > > > new interrupts in IRR. This causes interrupts from device to be lost during > > > cpu offline. The issue was found when explicitly setting MSI affinity to a > > > CPU and immediately offlining it. It was simple to recreate with a USB > > > ethernet device and doing I/O to it while the CPU is offlined. Lost > > > interrupts happen even when Interrupt Remapping is enabled. > > > > > > Current code does apic_soft_disable() before migrating interrupts. > > > > > > native_cpu_disable() > > > { > > > ... > > > apic_soft_disable(); > > > cpu_disable_common(); > > > --> fixup_irqs(); // Too late to capture anything in IRR. > > > } > > > > > > Just fliping the above call sequence seems to hit the IRR checks > > > and the lost interrupt is fixed for both legacy MSI and when > > > interrupt remapping is enabled. > > > > > > > > > Fixes: 60dcaad5736f ("x86/hotplug: Silence APIC and NMI when CPU is dead") > > > Link: https://lore.kernel.org/lkml/875zdarr4h.fsf@xxxxxxxxxxxxxxxxxxxxxxx/ > > > Signed-off-by: Ashok Raj <ashok.raj@xxxxxxxxx> > > > > > > To: linux-kernel@xxxxxxxxxxxxxxx > > > To: Thomas Gleixner <tglx@xxxxxxxxxxxxx> > > > Cc: Sukumar Ghorai <sukumar.ghorai@xxxxxxxxx> > > > Cc: Srikanth Nandamuri <srikanth.nandamuri@xxxxxxxxx> > > > Cc: Evan Green <evgreen@xxxxxxxxxxxx> > > > Cc: Mathias Nyman <mathias.nyman@xxxxxxxxxxxxxxx> > > > Cc: Bjorn Helgaas <bhelgaas@xxxxxxxxxx> > > > Cc: stable@xxxxxxxxxxxxxxx > > > --- > > > arch/x86/kernel/smpboot.c | 11 +++++++++-- > > > 1 file changed, 9 insertions(+), 2 deletions(-) > > > > > > diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c > > > index ffbd9a3d78d8..278cc9f92f2f 100644 > > > --- a/arch/x86/kernel/smpboot.c > > > +++ b/arch/x86/kernel/smpboot.c > > > @@ -1603,13 +1603,20 @@ int native_cpu_disable(void) > > > if (ret) > > > return ret; > > > > > > + cpu_disable_common(); > > > /* > > > * Disable the local APIC. Otherwise IPI broadcasts will reach > > > * it. It still responds normally to INIT, NMI, SMI, and SIPI > > > - * messages. > > > > I'm slightly unclear about whether interrupts are disabled at the core > > by this point or not. I followed native_cpu_disable() up to > > __cpu_disable(), up to take_cpu_down(). This is passed into a call to > > stop_machine_cpuslocked(), where interrupts get disabled at the core. > > So unless there's another path, it seems like interrupts are always > > disabled at the core by this point. > > local_irq_disable() just does cli() which allows interrupts to trickle > in to the IRR bits, and once you do sti() things would flow back for > normal interrupt processing. > > > > > > If interrupts are always disabled, then the comment above is a little > > Disable interrupts is different from disabling LAPIC. Once you do the > apic_soft_disable(), there is nothing flowing into the LAPIC except > for INIT, NMI, SMI and SIPI messages. > > This turns off the pipe for all other interrupts to enter LAPIC. Which > is different from doing a cli(). I understand the distinction. I was mostly musing on the difference in behavior your change causes if this function is entered with interrupts enabled at the core (ie sti()). But I think it never is, so that thought is moot. I could never repro the issue reliably on comet lake after Thomas' original fix. But I can still repro it easily on jasper lake. And this patch fixes the issue for me on that platform. Thanks for the fix. Reviewed-by: Evan Green <evgreen@xxxxxxxxxxxx> Tested-by: Evan Green <evgreen@xxxxxxxxxxxx>