Re: [PATCH] KVM/x86: Do not clear SIPI while in SMM

On 4/17/24 8:40 AM, Igor Mammedov wrote:
On Tue, 16 Apr 2024 19:37:09 -0400
boris.ostrovsky@xxxxxxxxxx wrote:

On 4/16/24 7:17 PM, Sean Christopherson wrote:
On Tue, Apr 16, 2024, boris.ostrovsky@xxxxxxxxxx wrote:
(Sorry, need to resend)

On 4/16/24 6:03 PM, Paolo Bonzini wrote:
On Tue, Apr 16, 2024 at 10:57 PM <boris.ostrovsky@xxxxxxxxxx> wrote:
On 4/16/24 4:53 PM, Paolo Bonzini wrote:
On 4/16/24 22:47, Boris Ostrovsky wrote:
Keeping the SIPI pending avoids this scenario.

This is incorrect - it's yet another ugly legacy facet of x86, but we
have to live with it.  SIPI is discarded because the code is supposed
to retry it if needed ("INIT-SIPI-SIPI").

I couldn't find in the SDM/APM a definitive statement about whether SIPI
is supposed to be dropped.

I think the manual is pretty consistent that SIPIs are never latched,
they're only ever used in wait-for-SIPI state.
The sender should set a flag as early as possible in the SIPI code so
that it's clear that it was not received; and an extra SIPI is not a
problem, it will be ignored anyway and will not cause trouble if
there's a race.

What is the reproducer for this?

Hotplugging/unplugging cpus in a loop, especially if you oversubscribe
the guest, will get you there in 10-15 minutes.

Typically (although I think not always) this happens when OVMF is trying to rendezvous, a processor is missing, and it is sent an extra SMI.

Can you go into more detail? I wasn't even aware that OVMF's SMM
supported hotplug - on real hardware I think there's extra work from
the BMC to coordinate all SMIs across both existing and hotplugged
packages(*)


It's been supported by OVMF for a couple of years (in fact, IIRC you were part of at least the initial conversations about this, at least for the unplug part).

During hotplug QEMU gathers all CPUs in OVMF from (I think) ich9_apm_ctrl_changed() and they are all waited for in SmmCpuRendezvous()->SmmWaitForApArrival(). Occasionally it may so happen that the SMI from QEMU is not delivered to a processor that was *just* successfully hotplugged, and so it is pinged again (https://github.com/tianocore/edk2/blob/fcfdbe29874320e9f876baa7afebc3fca8f4a7df/UefiCpuPkg/PiSmmCpuDxeSmm/MpService.c#L304).


At the same time this processor is now being brought up by the kernel and is being sent INIT-SIPI-SIPI. If these (or at least the SIPIs) arrive after the SMI reaches the processor, then that processor is not going to have a good day.

Do you use qemu/firmware combo that negotiated ICH9_LPC_SMI_F_CPU_HOTPLUG_BIT/
ICH9_LPC_SMI_F_CPU_HOT_UNPLUG_BIT features?

Yes.



It's specifically SIPI that's problematic.  INIT is blocked by SMM, but latched,
and SMIs are blocked by WFS, but latched.  And AFAICT, KVM emulates all of those
combinations correctly.

Why is the SMI from QEMU not delivered?  That seems like the smoking gun.

I haven't actually traced this, but it seems that what happens is that the newly-added processor is about to leave SMM and the count of in-SMM processors is decremented. At the same time, since the processor is still in SMM, QEMU's SMI is not taken.

And so when the count is looked at again in SmmWaitForApArrival() one
processor is missing.

The current QEMU CPU hotplug workflow with SMM enabled should be the following:

   1. OSPM gets list(N) of hotplugged cpus
   2. OSPM hands over control to firmware (SMM callback leading to SMI broadcast)
   3. Firmware at this point shall initialize all new CPUs (incl. relocating SMBASE for the new ones);
      it shall pull in all CPUs that are present at the moment
   4. Firmware returns control to OSPM
   5. OSPM sends Notify to the list(N) CPUs triggering INIT-SIPI-SIPI _only_ on
      those CPUs that it collected in step 1

The above steps repeat until all hotplugged CPUs are handled.

In a nutshell, INIT-SIPI-SIPI shall not be sent to a freshly hotplugged CPU that OSPM hasn't seen at (1) yet _and_ that firmware hasn't yet initialized at (3).

The CPUs enumerated at (3) shall at least include the CPUs present at (1), and may include newer CPUs that arrived in between (1) and (3).

The CPUs collected at (1) shall all get the SMI; if that doesn't happen, then the hotplug workflow won't work as expected. In that case we need to figure out why the SMI is not delivered or why firmware isn't waiting for the hotplugged CPU.

I noticed that I was using a few months old QEMU bits, and now I am having trouble reproducing this on the latest bits. Let me see if I can get this to fail with the latest first, and then try to trace why the processor is in this unexpected state.

-boris



