On Thu, Sep 14, 2023, Like Xu wrote:
> On 2/9/2023 2:56 am, Jim Mattson wrote:
> > When the irq_work callback, kvm_pmi_trigger_fn(), is invoked during a
> > VM-exit that also invokes __kvm_perf_overflow() as a result of
> > instruction emulation, kvm_pmu_deliver_pmi() will be called twice
> > before the next VM-entry.
> >
> > That shouldn't be a problem. The local APIC is supposed to
>
> As you said, that shouldn't be a problem.

It's still a bug though, overflow should only happen once.

> > automatically set the mask flag in LVTPC when it handles a PMI, so the
> > second PMI should be inhibited. However, KVM's local APIC emulation
> > fails to set the mask flag in LVTPC when it handles a PMI, so two PMIs
> > are delivered via the local APIC. In the common case, where LVTPC is
> > configured to deliver an NMI, the first NMI is vectored through the
> > guest IDT, and the second one is held pending. When the NMI handler
> > returns, the second NMI is vectored through the IDT. For Linux guests,
> > this results in the "dazed and confused" spurious NMI message.
> >
> > Though the obvious fix is to set the mask flag in LVTPC when handling
> > a PMI, KVM's logic around synthesizing a PMI is unnecessarily
> > convoluted.
>
> Any obstruction issues on fixing in this direction ?

No, patch 2/2 in this series fixes LVTPC masking bug.

I haven't dug through all of this yet, but my gut reaction is that I'm very
strongly in favor of not using irq_work just to ensure KVM kicks a vCPU out of
HLT.  That's just ridiculous.