Hi Sean,

On 8/24/2023 7:52 PM, Sean Christopherson wrote:
> +kvm and lkml, didn't realize they weren't Cc'd in the original mail.
>
> On Thu, Aug 24, 2023, Santosh Shukla wrote:
>> Hi Sean,
>>
>> On 8/23/2023 8:33 PM, Sean Christopherson wrote:
>>> On Fri, Aug 18, 2023, 陈培鸿(乘鸿) wrote:
>>>> According to the results, I found that in the case of concurrent NMIs, some
>>>> NMIs are still injected through eventinj instead of vNMI.
>>>
>>> Key word is "some", having two NMIs arrive "simultaneously" is uncommon. In quotes
>>> because if a vCPU is scheduled out or otherwise delayed, two NMIs can be recognized
>>> by KVM at the same time even if there was a sizeable delay between when they were
>>> sent.
>>>
>>>> Based on the above explanations, I summarize my questions as follows:
>>>> 1. According to the specification of AMD SVM vNMI, with vNMI enabled, will
>>>> some NMIs be injected through eventinj under the condition of concurrent
>>>> NMIs?
>>>
>>> Yes.
>>>
>>>> 2. If yes, what is the role of vNMI? Is it just an addition to eventinj? What
>>>> benefits is it designed to expect? Is there any benchmark data support?
>>>
>>> Manually injecting NMIs isn't problematic from a performance perspective. KVM
>>> takes control of the vCPU, i.e. forces a VM-Exit, to pend a virtual NMI, so there's
>>> no extra VM-Exit.
>>>
>>> The value added by vNMI support is that KVM doesn't need to manually track/detect
>>> when NMIs become unblocked in the guest. SVM doesn't provide a hardware-supported
>>> NMI-window exiting, so KVM has to intercept _and_ single-step IRET, which adds two
>>> VM-Exits for _every_ NMI when vNMI isn't available (and it's a complete mess for
>>> things like SEV-ES).
>>>
>>>> 3. If not, does it mean that the current SVM's vNMI support scheme in the
>>>> Linux mainline code is flawed? How should it be fixed?
>>>
>>> The approach as a whole isn't flawed, it's the best KVM can do given SVM's vNMI
>>> architecture and KVM's ABI with respect to "concurrent" NMIs.
>>>
>>> Hrm, though there does appear to be a bug in the injecting path. KVM doesn't
>>> manually set V_NMI_BLOCKING_MASK, and will unnecessarily enable IRET interception
>>> when manually injecting a vNMI. Intercepting IRET should be unnecessary because
>>> hardware will automatically accept the pending vNMI when NMIs become unblocked.
>>> And I don't see anything in the APM that suggests hardware will set V_NMI_BLOCKING_MASK
>>> when software directly injects an NMI.
>>>
>>> So I think we need:
>>>
>>> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
>>> index d381ad424554..c956a9f500a2 100644
>>> --- a/arch/x86/kvm/svm/svm.c
>>> +++ b/arch/x86/kvm/svm/svm.c
>>> @@ -3476,6 +3476,11 @@ static void svm_inject_nmi(struct kvm_vcpu *vcpu)
>>>         if (svm->nmi_l1_to_l2)
>>>                 return;
>>>
>>> +       if (is_vnmi_enabled(svm)) {
>>> +               svm->vmcb->control.int_ctl |= V_NMI_BLOCKING_MASK;
>>> +               return;
>>> +       }
>>> +
>>>         svm->nmi_masked = true;
>>>         svm_set_iret_intercept(svm);
>>>         ++vcpu->stat.nmi_injections;
>>> --
>>>
>>> or if hardware does set V_NMI_BLOCKING_MASK in this case, just:
>>>
>>
>> Yes, HW does set BLOCKING_MASK when HW takes the pending vNMI event.
>
> I'm not asking about the pending vNMI case, which is clearly spelled out in the
> APM. I'm asking about directly injecting an NMI via:
>
>         svm->vmcb->control.event_inj = SVM_EVTINJ_VALID | SVM_EVTINJ_TYPE_NMI;
>

Yes. This is documented in APM as well.
https://www.amd.com/system/files/TechDocs/24593.pdf :

"15.21.10 NMI Virtualization"
"If Event Injection is used to inject an NMI when NMI Virtualization is enabled,
VMRUN sets V_NMI_MASK in the guest state."

>>> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
>>> index d381ad424554..201a1a33ecd2 100644
>>> --- a/arch/x86/kvm/svm/svm.c
>>> +++ b/arch/x86/kvm/svm/svm.c
>>> @@ -3473,7 +3473,7 @@ static void svm_inject_nmi(struct kvm_vcpu *vcpu)
>>>
>>>         svm->vmcb->control.event_inj = SVM_EVTINJ_VALID | SVM_EVTINJ_TYPE_NMI;
>>>
>>> -       if (svm->nmi_l1_to_l2)
>>> +       if (svm->nmi_l1_to_l2 || is_vnmi_enabled(svm))
>>>                 return;
>>>
>>>         svm->nmi_masked = true;
>>> --
>>>
>>
>> The above proposal makes sense to me. I was reviewing the source code flow
>> for the scenario where two consecutive NMIs need to be delivered to the guest.
>> For example, process_nmi will pend the first NMI, and then the second NMI will
>> be injected through EVTINJ, because (kvm_x86_inject_nmi) will get called and
>> that will set the _iret_intercept.
>>
>> With your proposal, inject_nmi will set the event_inj NMI w/o the IRET.
>> I think we could check for "is_vnmi_enabled" before programming the "evt_inj"?
>
> No, because the whole point of this path is to directly inject an NMI when NMIs
> are NOT blocked in the guest AND there is already a pending vNMI.

I agree, your patch looks fine.

I'll test the above patch and share the feedback.

Thanks,
Santosh
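P.S.: For reference, the tail of svm_inject_nmi() with the one-line change applied
would look roughly like the sketch below. It is reconstructed from the diff context
quoted above rather than copied from the tree, so treat it as illustrative only:

static void svm_inject_nmi(struct kvm_vcpu *vcpu)
{
        struct vcpu_svm *svm = to_svm(vcpu);

        /*
         * Directly inject the NMI.  With vNMI enabled, VMRUN sets
         * V_NMI_MASK in the guest state (APM 15.21.10), so no manual
         * tracking of NMI blocking is needed.
         */
        svm->vmcb->control.event_inj = SVM_EVTINJ_VALID | SVM_EVTINJ_TYPE_NMI;

        if (svm->nmi_l1_to_l2 || is_vnmi_enabled(svm))
                return;

        /*
         * Legacy (non-vNMI) path: track NMI blocking in software and
         * intercept IRET to detect when NMIs become unblocked again.
         */
        svm->nmi_masked = true;
        svm_set_iret_intercept(svm);
        ++vcpu->stat.nmi_injections;
}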