https://bugzilla.kernel.org/show_bug.cgi?id=217304

--- Comment #4 from Eric Li (lixiaoyi13691419520@xxxxxxxxx) ---
On Wed, 2023-04-12 at 17:00 +0000, bugzilla-daemon@xxxxxxxxxx wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=217304
> 
> --- Comment #3 from Sean Christopherson (seanjc@xxxxxxxxxx) ---
> On Thu, Apr 06, 2023, bugzilla-daemon@xxxxxxxxxx wrote:
> > https://bugzilla.kernel.org/show_bug.cgi?id=217304
> > 
> > --- Comment #1 from Sean Christopherson (seanjc@xxxxxxxxxx) ---
> > On Thu, Apr 06, 2023, bugzilla-daemon@xxxxxxxxxx wrote:
> > > Assume KVM runs in L0, LHV runs in L1, and the nested guest runs
> > > in L2.
> > > 
> > > The code in LHV performs an experiment (called "Experiment 13" in
> > > serial output) on CPU 0 to test the behavior of NMI blocking. The
> > > experiment steps are:
> > > 1. Prepare state such that the CPU is currently in L1 (LHV) and
> > >    NMIs are blocked.
> > > 2. Modify VMCS12 to make sure that L2 has virtual NMIs enabled
> > >    (NMI exiting = 1, virtual NMIs = 1) and that L2 does not block
> > >    NMIs (blocking by NMI = 0).
> > > 3. VM entry to L2.
> > > 4. L2 performs VMCALL and gets a VM exit to L1.
> > > 5. L1 checks whether NMIs are blocked.
> > > 
> > > The expected behavior is that NMIs should be unblocked, which is
> > > reproduced on real hardware. According to the Intel SDM, NMIs
> > > should be unblocked after VM entry to L2 (step 3). After VM exit
> > > to L1 (step 4), NMI blocking does not change, so NMIs are still
> > > unblocked. This behavior is reproducible on real hardware.
> > > 
> > > However, when running on KVM, the experiment shows that at step 5,
> > > NMIs are blocked in L1. Thus, I think NMI blocking is not
> > > implemented correctly in KVM's nested virtualization.
> > 
> > Ya, KVM blocks NMIs on nested NMI VM-Exits, but doesn't unblock NMIs
> > for all other exit types. I believe this is the fix (untested):
> > 
> > ---
> >  arch/x86/kvm/vmx/nested.c | 12 +++++++-----
> >  1 file changed, 7 insertions(+), 5 deletions(-)
> > 
> > diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> > index 96ede74a6067..4240a052628a 100644
> > --- a/arch/x86/kvm/vmx/nested.c
> > +++ b/arch/x86/kvm/vmx/nested.c
> > @@ -4164,12 +4164,7 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu)
> >  		nested_vmx_vmexit(vcpu, EXIT_REASON_EXCEPTION_NMI,
> >  				  NMI_VECTOR | INTR_TYPE_NMI_INTR |
> >  				  INTR_INFO_VALID_MASK, 0);
> > -		/*
> > -		 * The NMI-triggered VM exit counts as injection:
> > -		 * clear this one and block further NMIs.
> > -		 */
> >  		vcpu->arch.nmi_pending = 0;
> > -		vmx_set_nmi_mask(vcpu, true);
> >  		return 0;
> >  	}
> > 
> > @@ -4865,6 +4860,13 @@ void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 vm_exit_reason,
> >  			vmcs12->vm_exit_intr_info = irq |
> >  				INTR_INFO_VALID_MASK | INTR_TYPE_EXT_INTR;
> >  	}
> > 
> > +	/*
> > +	 * NMIs are blocked on VM-Exit due to NMI, and unblocked by all
> > +	 * other VM-Exit types.
> > +	 */
> > +	vmx_set_nmi_mask(vcpu, (u16)vm_exit_reason == EXIT_REASON_EXCEPTION_NMI &&
> > +			       !is_nmi(vmcs12->vm_exit_intr_info));
> 
> Ugh, this is wrong. As Eric stated in the bug report, and per section
> "27.5.5 Updating Non-Register State", VM-Exit does *not* affect NMI
> blocking except if the VM-Exit is directly due to an NMI:
> 
>   Event blocking is affected as follows:
>   * There is no blocking by STI or by MOV SS after a VM exit.
>   * VM exits caused directly by non-maskable interrupts (NMIs) cause
>     blocking by NMI (see Table 24-3). Other VM exits do not affect
>     blocking by NMI. (See Section 27.1 for the case in which an NMI
>     causes a VM exit indirectly.)

Correct. In my experiment, NMI is unblocked at VM entry, and VM exit does
not change NMI blocking (i.e. NMIs remain unblocked).
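For concreteness, the heart of the experiment is roughly the sketch below.
This is a simplification, not LHV's actual code: vmcs12_read(),
vmcs12_write(), nmi_is_blocked() and serial_printf() are hypothetical
helpers, while the VMCS field encodings and control bits are the
architectural ones from the SDM.

#include <stdint.h>

/* VMCS field encodings and bits, per the Intel SDM. */
#define PIN_BASED_VM_EXEC_CONTROL	0x00004000
#define PIN_BASED_NMI_EXITING		(1u << 3)
#define PIN_BASED_VIRTUAL_NMIS		(1u << 5)
#define GUEST_INTERRUPTIBILITY_INFO	0x00004824
#define GUEST_INTR_STATE_NMI		(1u << 3)	/* blocking by NMI */

static void experiment_13_prepare_vmcs12(void)
{
	uint32_t pin, intr;

	/* Step 2: NMI exiting = 1 and virtual NMIs = 1 in VMCS12. */
	pin = vmcs12_read(PIN_BASED_VM_EXEC_CONTROL);
	vmcs12_write(PIN_BASED_VM_EXEC_CONTROL,
		     pin | PIN_BASED_NMI_EXITING | PIN_BASED_VIRTUAL_NMIS);

	/* Step 2: L2 does not block NMIs (blocking by NMI = 0). */
	intr = vmcs12_read(GUEST_INTERRUPTIBILITY_INFO);
	vmcs12_write(GUEST_INTERRUPTIBILITY_INFO,
		     intr & ~GUEST_INTR_STATE_NMI);
}

static void experiment_13_check_after_vmexit(void)
{
	/*
	 * Step 5: after L2's VMCALL exits to L1, the VM exit leaves NMI
	 * blocking unchanged, so NMIs should still be unblocked here.
	 * On real hardware they are; on KVM they are wrongly blocked.
	 */
	if (nmi_is_blocked())
		serial_printf("Experiment 13: FAIL, NMIs blocked in L1\n");
}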
> The scenario here is that virtual NMIs are enabled, in which case
> VM-Enter, not VM-Exit, effectively clears NMI blocking. From "26.7.1
> Interruptibility State":
> 
>   The blocking of non-maskable interrupts (NMIs) is determined as
>   follows:
>   * If the "virtual NMIs" VM-execution control is 0, NMIs are blocked
>     if and only if bit 3 (blocking by NMI) in the interruptibility-state
>     field is 1. If the "NMI exiting" VM-execution control is 0,
>     execution of the IRET instruction removes this blocking (even if
>     the instruction generates a fault). If the "NMI exiting" control is
>     1, IRET does not affect this blocking.
>   * The following items describe the use of bit 3 (blocking by NMI) in
>     the interruptibility-state field if the "virtual NMIs" VM-execution
>     control is 1:
>     * The bit's value does not affect the blocking of NMIs after VM
>       entry. NMIs are not blocked in VMX non-root operation (except for
>       ordinary blocking for other reasons, such as by the MOV SS
>       instruction, the wait-for-SIPI state, etc.)
>     * The bit's value determines whether there is virtual-NMI blocking
>       after VM entry. If the bit is 1, virtual-NMI blocking is in
>       effect after VM entry. If the bit is 0, there is no virtual-NMI
>       blocking after VM entry unless the VM entry is injecting an NMI
>       (see Section 26.6.1.1). Execution of IRET removes virtual-NMI
>       blocking (even if the instruction generates a fault).
> 
> I.e. forcing NMIs to be unblocked is wrong when virtual NMIs are
> disabled.
> 
> Unfortunately, that means fixing this will require a much more involved
> patch (series?), e.g. KVM can't modify NMI blocking until the VM-Enter
> is successful, at which point vmcs02, not vmcs01, is loaded, and so KVM
> will likely need to track NMI blocking in a software variable. That in
> turn gets complicated by the !vNMI case, because then KVM needs to
> propagate NMI blocking between vmcs01, vmcs12, and vmcs02. Blech.

Yes, handling NMIs perfectly in nested virtualization may require a
complicated implementation. There are many strange cases to think about
(e.g. the priority between NMI-window VM exits and NMI interrupts).

> I'm going to punt fixing this due to lack of bandwidth, and AFAIK lack
> of a use case beyond testing. Hopefully I'll be able to revisit this in
> a few weeks, but that might be wishful thinking.

I agree. This case probably only appears in testing. I can't think of a
reasonable reason for a hypervisor to perform VM entry with NMIs blocked.
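To make the software-variable direction concrete, below is a very rough
sketch of it (untested and not a real patch; the two function names and
the l1_nmi_blocked field are made up, while to_vmx(), vmx_set_nmi_mask(),
is_nmi(), struct vmcs12 and the PIN_BASED_* / EXIT_REASON_* constants are
existing KVM/VMX names):

/*
 * Hypothetical hook, called only once the nested VM-Enter is known to
 * have succeeded. vmcs02 is live at that point, so L1's NMI blocking
 * cannot be written into the current VMCS; remember it in a software
 * variable instead.
 */
static void nested_vmx_post_vmenter_nmi(struct vcpu_vmx *vmx,
					struct vmcs12 *vmcs12)
{
	/*
	 * vNMI: per SDM 26.7.1, NMIs are unblocked in VMX non-root
	 * operation after VM entry. The !vNMI propagation of bit 3
	 * between vmcs01, vmcs12, and vmcs02 is omitted here.
	 */
	if (vmcs12->pin_based_vm_exec_control & PIN_BASED_VIRTUAL_NMIS)
		vmx->nested.l1_nmi_blocked = false;	/* hypothetical field */
}

/* Hypothetical hook, called while emulating the VM exit to L1. */
static void nested_vmx_vmexit_nmi(struct kvm_vcpu *vcpu,
				  u32 exit_reason, u32 intr_info)
{
	struct vcpu_vmx *vmx = to_vmx(vcpu);

	/*
	 * Per SDM 27.5.5, only a VM exit caused directly by an NMI
	 * blocks NMIs; all other VM exits leave blocking unchanged.
	 */
	if ((u16)exit_reason == EXIT_REASON_EXCEPTION_NMI &&
	    is_nmi(intr_info))
		vmx->nested.l1_nmi_blocked = true;

	/* vmcs01 is about to be loaded again; materialize the state. */
	vmx_set_nmi_mask(vcpu, vmx->nested.l1_nmi_blocked);
}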