On Fri, Sep 22, 2023, Mingwei Zhang wrote: > On Fri, Sep 22, 2023 at 12:21 PM Sean Christopherson <seanjc@xxxxxxxxxx> wrote: > > > > On Fri, Sep 22, 2023, Jim Mattson wrote: > > > On Fri, Sep 22, 2023 at 11:46 AM Sean Christopherson <seanjc@xxxxxxxxxx> wrote: > > > > > > > > On Fri, Sep 01, 2023, Jim Mattson wrote: > > > > > When the irq_work callback, kvm_pmi_trigger_fn(), is invoked during a > > > > > VM-exit that also invokes __kvm_perf_overflow() as a result of > > > > > instruction emulation, kvm_pmu_deliver_pmi() will be called twice > > > > > before the next VM-entry. > > > > > > > > > > That shouldn't be a problem. The local APIC is supposed to > > > > > automatically set the mask flag in LVTPC when it handles a PMI, so the > > > > > second PMI should be inhibited. However, KVM's local APIC emulation > > > > > fails to set the mask flag in LVTPC when it handles a PMI, so two PMIs > > > > > are delivered via the local APIC. In the common case, where LVTPC is > > > > > configured to deliver an NMI, the first NMI is vectored through the > > > > > guest IDT, and the second one is held pending. When the NMI handler > > > > > returns, the second NMI is vectored through the IDT. For Linux guests, > > > > > this results in the "dazed and confused" spurious NMI message. > > > > > > > > > > Though the obvious fix is to set the mask flag in LVTPC when handling > > > > > a PMI, KVM's logic around synthesizing a PMI is unnecessarily > > > > > convoluted. > > > > > > > > To address Like's question about whether not this is necessary, I think we should > > > > rephrase this to explicitly state this is a bug irrespective of the whole LVTPC > > > > masking thing. > > > > > > > > And I think it makes sense to swap the order of the two patches. The LVTPC masking > > > > fix is a clearcut architectural violation. This is a bit more of a grey area, > > > > though still blatantly buggy. > > > > > > The reason I ordered the patches as I did is that when this patch > > > comes first, it actually fixes the problem that was introduced in > > > commit 9cd803d496e7 ("KVM: x86: Update vPMCs when retiring > > > instructions"). If this patch comes second, it's less clear that it > > > fixes a bug, since the other patch renders this one essentially moot. > > > > Yeah, but as Like pointed out, the way the changelog is worded just raises the > > question of why this change is necessary. > > > > I think we should tag them both for stable. They're both bug fixes, regardless > > of the ordering. > > Agree. Both patches are fixing the general potential buggy situation > of multiple PMI injection on one vm entry: one software level defense > (forcing the usage of KVM_REQ_PMI) and one hardware level defense > (preventing PMI injection using mask). > > Although neither patch in this series is fixing the root cause of this > specific double PMI injection bug, I don't see a reason why we cannot > add a "fixes" tag to them, since we may fix it and create it again. > > I am currently working on it and testing my patch. Please give me some > time, I think I could try sending out one version today. Once that is > done, I will combine mine with the existing patch and send it out as a > series. Me confused, what patch? And what does this patch have to do with Jim's series? Unless I've missed something, Jim's patches are good to go with my nits addressed.