> On Thu, 26 Sep 2024 18:22:39 -0700
> Eric Mackay <eric.mackay@xxxxxxxxxx> wrote:
>
> > On 9/24/24 5:40 AM, Igor Mammedov wrote:
> > >> On Fri, 19 Apr 2024 12:17:01 -0400
> > >> boris.ostrovsky@xxxxxxxxxx wrote:
> > >>
> > >>> On 4/17/24 9:58 AM, boris.ostrovsky@xxxxxxxxxx wrote:
> > >>>>
> > >>>> I noticed that I was using a few months old qemu bits and now I am
> > >>>> having trouble reproducing this on latest bits. Let me see if I can get
> > >>>> this to fail with latest first and then try to trace why the processor
> > >>>> is in this unexpected state.
> > >>>
> > >>> Looks like 012b170173bc "system/qdev-monitor: move drain_call_rcu call
> > >>> under if (!dev) in qmp_device_add()" is what makes the test stop failing.
> > >>>
> > >>> I need to understand whether the lack of failures is a side effect of
> > >>> timing changes that simply make hotplug failures less likely, or whether
> > >>> this is an actual (but seemingly unintentional) fix.
> > >>
> > >> Agreed, we should find the culprit of the problem.
> > >
> > > I haven't been able to spend much time on this unfortunately; Eric is
> > > now starting to look at this again.
> > >
> > > One of my theories was that ich9_apm_ctrl_changed() is sending SMIs to
> > > the vcpus serially, while on HW my understanding is that this is done as
> > > a broadcast, so I thought this could cause a race. I had a quick test
> > > with pausing and resuming all vcpus around the loop, but that didn't
> > > help.
> > >
> > >> PS:
> > >> Also, if you are using an AMD host, there was a regression in OVMF
> > >> where a vCPU that OSPM was already online-ing was yanked out from under
> > >> OSPM's feet by OVMF (which, depending on timing, could manifest as a
> > >> lost SIPI).
> > >>
> > >> The edk2 commit that should fix it is:
> > >> https://github.com/tianocore/edk2/commit/1c19ccd5103b
> > >>
> > >> Switching to an Intel host should rule that out at least.
> > >> (Or use the fixed edk2-ovmf-20240524-5.el10.noarch package from CentOS,
> > >> if you are forced to use an AMD host.)
> >
> > I haven't been able to reproduce the issue on an Intel host thus far,
> > but it may not be an apples-to-apples comparison because my AMD hosts
> > have a much higher core count.
> >
> > > I just tried with the latest bits that include this commit and was
> > > still able to reproduce the problem.
> > >
> > > -boris
> >
> > The initial hotplug of each CPU appears to complete from the
> > perspective of OVMF and OSPM. SMBASE relocation succeeds, and the new
> > CPU reports back from the pen. It seems to be the later INIT-SIPI-SIPI
> > sequence sent from the guest that doesn't complete.
> >
> > My working theory has been that some CPU/AP is lagging behind the others
> > while the BSP is waiting for all the APs to go into SMM, and the BSP just
> > gives up and moves on. Presumably the INIT-SIPI-SIPI is sent while that
> > CPU does finally go into SMM and the other CPUs are in normal mode.
> >
> > I've been able to observe that the SMI handler for the problematic CPU
> > will sometimes start running when no BSP is elected. This means we have
> > a window of time where that CPU will ignore SIPI, and at least one CPU
> > (the BSP) is in normal mode and capable of sending INIT-SIPI-SIPI from
> > the guest.
>
> I've re-read the whole thread and noticed Boris saying:
>
> On Tue, Apr 16, 2024 at 10:57 PM <boris.ostrovsky@xxxxxxxxxx> wrote:
> >
> > On 4/16/24 4:53 PM, Paolo Bonzini wrote:
> ...
> > >
> > > What is the reproducer for this?
> > Hotplugging/unplugging cpus in a loop, especially if you oversubscribe
> > the guest, will get you there in 10-15 minutes.
> ...
>
> So there was unplug involved as well, which has been broken since forever.
>
> The recent patch
> https://patchew.org/QEMU/20230427211013.2994127-1-alxndr@xxxxxx/20230427211013.2994127-2-alxndr@xxxxxx/
> has exposed an issue (an unexpected plug/unplug flow) with the root cause
> in OVMF. The firmware was letting uninvolved APs run wild in normal mode.
> As a result, the AP that was calling _EJ0 and holding the ACPI lock went
> on with _EJ0 and released the ACPI lock while the BSP and the CPU being
> removed were still in the SMM world. Any other plug/unplug op could then
> grab the ACPI lock and trigger another SMI, which breaks the hotplug
> flow's expectations (i.e. exclusive access to the hotplug registers
> during a plug/unplug op).
> Perhaps that's what you are observing.
>
> Please check whether the following helps:
> https://github.com/kraxel/edk2/commit/738c09f6b5ab87be48d754e62deb72b767415158

I haven't actually seen the guest crash during unplug, though certainly
there have been unplug failures. I haven't been keeping track of the
unplug failures as closely, but a test I ran over the weekend with this
patch added seemed to show fewer unplug failures. I'm still getting
hotplug failures that cause a guest crash though, so that mystery
remains.

> So yes, SIPI can be lost (which should be expected, as others noted),
> but that normally shouldn't be an issue since wakeup_secondary_cpu_via_init()
> does resend the SIPI.
> However, if wakeup_secondary_cpu is set to another handler that doesn't
> resend SIPI, it might be an issue.

We're using wakeup_secondary_cpu_via_init(). acpi_wakeup_cpu() and
wakeup_cpu_via_vmgexit(), for example, are a bit opaque to me, so I'm not
sure whether those code paths include a SIPI resend.
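
For reference, this is roughly the flow I understand
wakeup_secondary_cpu_via_init() to perform. It is a heavily abbreviated
paraphrase from my reading of arch/x86/kernel/smpboot.c, not the actual
kernel code; the function name and delay values below are placeholders.
The point is only that the STARTUP IPI goes out twice, so a single lost
SIPI normally gets another chance:

#include <asm/apic.h>     /* apic_icr_write(), APIC_DM_INIT, APIC_DM_STARTUP */
#include <linux/delay.h>  /* udelay() */

/* Illustrative paraphrase, not the real wakeup_secondary_cpu_via_init(). */
static void init_sipi_sipi_sketch(u32 apicid, unsigned long start_eip)
{
    /* INIT IPI: assert, wait a bit, then de-assert. */
    apic_icr_write(APIC_INT_LEVELTRIG | APIC_INT_ASSERT | APIC_DM_INIT, apicid);
    udelay(10000);                      /* placeholder delay */
    apic_icr_write(APIC_INT_LEVELTRIG | APIC_DM_INIT, apicid);

    /* STARTUP IPI, sent twice: if the first SIPI arrives while the target
     * is still ignoring SIPI, the resend gives it a second chance. */
    for (int j = 0; j < 2; j++) {
        apic_icr_write(APIC_DM_STARTUP | (start_eip >> 12), apicid);
        udelay(300);
    }
}

If one of the other wakeup_secondary_cpu handlers skips that second
STARTUP IPI, a SIPI lost in the window described above would not be
recovered, which is exactly the failure mode we seem to be chasing.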
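
And going back to Boris's earlier broadcast-vs-serial theory about
ich9_apm_ctrl_changed(): the delivery pattern in question looks roughly
like the sketch below. This is illustrative only, not the actual
hw/isa/lpc_ich9.c code (the function name is made up, the header paths
are approximate, and CPU_INTERRUPT_SMI comes from the x86 target
headers). The pause_all_vcpus()/resume_all_vcpus() bracket is the quick
experiment that was mentioned, which didn't help:

#include "qemu/osdep.h"
#include "hw/core/cpu.h"      /* CPUState, CPU_FOREACH(), cpu_interrupt() */
#include "sysemu/cpus.h"      /* pause_all_vcpus(), resume_all_vcpus() */

/* Illustrative sketch of the serial SMI delivery being discussed. */
static void apm_smi_serial_sketch(void)
{
    CPUState *cs;

    pause_all_vcpus();        /* experiment: stop every vCPU first... */
    CPU_FOREACH(cs) {
        /* The SMI is requested on one vCPU at a time rather than being
         * broadcast to all of them at once, as real hardware would do. */
        cpu_interrupt(cs, CPU_INTERRUPT_SMI);
    }
    resume_all_vcpus();       /* ...then let them all see the pending SMI */
}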