[Bug 217304] KVM does not handle NMI blocking correctly in nested virtualization


 



https://bugzilla.kernel.org/show_bug.cgi?id=217304

--- Comment #4 from Eric Li (lixiaoyi13691419520@xxxxxxxxx) ---
On Wed, 2023-04-12 at 17:00 +0000, bugzilla-daemon@xxxxxxxxxx wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=217304
> 
> --- Comment #3 from Sean Christopherson (seanjc@xxxxxxxxxx) ---
> On Thu, Apr 06, 2023, bugzilla-daemon@xxxxxxxxxx wrote:
> > https://bugzilla.kernel.org/show_bug.cgi?id=217304
> > 
> > --- Comment #1 from Sean Christopherson (seanjc@xxxxxxxxxx) ---
> > On Thu, Apr 06, 2023, bugzilla-daemon@xxxxxxxxxx wrote:
> > > Assume KVM runs in L0, LHV runs in L1, and the nested guest runs in L2.
> > > 
> > > The code in LHV performs an experiment (called "Experiment 13" in the
> > > serial output) on CPU 0 to test the behavior of NMI blocking.  The
> > > experiment steps are:
> > > 1. Prepare state such that the CPU is currently in L1 (LHV) and NMIs
> > >    are blocked
> > > 2. Modify VMCS12 to make sure that L2 has virtual NMIs enabled
> > >    (NMI exiting = 1, Virtual NMIs = 1), and that L2 does not block NMIs
> > >    (Blocking by NMI = 0)
> > > 3. VM entry to L2
> > > 4. L2 performs VMCALL, getting a VM exit to L1
> > > 5. L1 checks whether NMIs are blocked.
> > > 
> > > The expected behavior is that NMIs should be unblocked, which is
> > > reproduced on real hardware.  According to the Intel SDM, NMIs should
> > > be unblocked after VM entry to L2 (step 3).  After VM exit to L1
> > > (step 4), NMI blocking does not change, so NMIs are still unblocked.
> > > This behavior is reproducible on real hardware.
> > > 
> > > However, when running on KVM, the experiment shows that at step 5, NMIs
> > > are blocked in L1.  Thus, I think NMI blocking is not implemented
> > > correctly in KVM's nested virtualization.
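
As a rough illustration of steps 2-3 (not the LHV's actual code, which is not shown in this thread), an L1 hypervisor might program VMCS12 along these lines; vmcs12_read()/vmcs12_write()/vmentry_to_l2() are hypothetical helpers, while the field encodings and control bits come from the Intel SDM (they match arch/x86/include/asm/vmx.h):

/*
 * Sketch of Experiment 13, steps 2-3.  The vmcs12_* and vmentry_to_l2()
 * helpers are hypothetical; only the VMCS field encodings and bits below
 * are real (Intel SDM / arch/x86/include/asm/vmx.h).
 */
#include <stdint.h>

#define PIN_BASED_VM_EXEC_CONTROL    0x00004000u  /* pin-based VM-execution controls */
#define GUEST_INTERRUPTIBILITY_INFO  0x00004824u  /* guest interruptibility state */
#define PIN_BASED_NMI_EXITING        0x00000008u  /* "NMI exiting" */
#define PIN_BASED_VIRTUAL_NMIS       0x00000020u  /* "virtual NMIs" */
#define GUEST_INTR_STATE_NMI         0x00000008u  /* bit 3: blocking by NMI */

extern uint32_t vmcs12_read(uint32_t field);              /* hypothetical */
extern void vmcs12_write(uint32_t field, uint32_t value); /* hypothetical */
extern void vmentry_to_l2(void);                          /* hypothetical */

static void experiment_13_steps_2_and_3(void)
{
	uint32_t pin, intr;

	/* Step 2: NMI exiting = 1, virtual NMIs = 1 in VMCS12. */
	pin = vmcs12_read(PIN_BASED_VM_EXEC_CONTROL);
	vmcs12_write(PIN_BASED_VM_EXEC_CONTROL,
		     pin | PIN_BASED_NMI_EXITING | PIN_BASED_VIRTUAL_NMIS);

	/* Step 2: Blocking by NMI = 0 in the L2 guest state. */
	intr = vmcs12_read(GUEST_INTERRUPTIBILITY_INFO);
	vmcs12_write(GUEST_INTERRUPTIBILITY_INFO, intr & ~GUEST_INTR_STATE_NMI);

	/* Step 3: VM entry to L2; L2 then executes VMCALL (step 4). */
	vmentry_to_l2();
}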
> > 
> > Ya, KVM blocks NMIs on nested NMI VM-Exits, but doesn't unblock NMIs for
> > all other exit types.  I believe this is the fix (untested):
> > 
> > ---
> >  arch/x86/kvm/vmx/nested.c | 12 +++++++-----
> >  1 file changed, 7 insertions(+), 5 deletions(-)
> > 
> > diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> > index 96ede74a6067..4240a052628a 100644
> > --- a/arch/x86/kvm/vmx/nested.c
> > +++ b/arch/x86/kvm/vmx/nested.c
> > @@ -4164,12 +4164,7 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu)
> >                 nested_vmx_vmexit(vcpu, EXIT_REASON_EXCEPTION_NMI,
> >                                   NMI_VECTOR | INTR_TYPE_NMI_INTR |
> >                                   INTR_INFO_VALID_MASK, 0);
> > -               /*
> > -                * The NMI-triggered VM exit counts as injection:
> > -                * clear this one and block further NMIs.
> > -                */
> >                 vcpu->arch.nmi_pending = 0;
> > -               vmx_set_nmi_mask(vcpu, true);
> >                 return 0;
> >         }
> > 
> > @@ -4865,6 +4860,13 @@ void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 vm_exit_reason,
> >                                 INTR_INFO_VALID_MASK | INTR_TYPE_EXT_INTR;
> >                 }
> > 
> > +               /*
> > +                * NMIs are blocked on VM-Exit due to NMI, and unblocked by all
> > +                * other VM-Exit types.
> > +                */
> > +               vmx_set_nmi_mask(vcpu, (u16)vm_exit_reason == EXIT_REASON_EXCEPTION_NMI &&
> > +                                      !is_nmi(vmcs12->vm_exit_intr_info));
> 
> Ugh, this is wrong.  As Eric stated in the bug report, and per section
> "27.5.5 Updating Non-Register State", VM-Exit does *not* affect NMI
> blocking except if the VM-Exit is directly due to an NMI:
> 
>   Event blocking is affected as follows:
>     * There is no blocking by STI or by MOV SS after a VM exit.
>     * VM exits caused directly by non-maskable interrupts (NMIs) cause
>       blocking by NMI (see Table 24-3). Other VM exits do not affect
>       blocking by NMI. (See Section 27.1 for the case in which an NMI
>       causes a VM exit indirectly.)
> 
Correct. In my experiment, NMIs are unblocked at VM entry, and VM exit does
not change NMI blocking (i.e. they remain unblocked).
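
One way for L1 to check NMI blocking at step 5 is to send itself an NMI through the local APIC and see whether it is delivered. The sketch below is only illustrative and is not the LHV's actual test code; lapic_icr_write() and cpu_relax() are assumed helpers, and the ICR bits are the xAPIC encodings from the SDM:

/* Illustrative NMI-blocking probe; not the LHV's actual code. */
#define APIC_DEST_SELF   0x40000u  /* ICR destination shorthand: self */
#define APIC_INT_ASSERT  0x04000u  /* ICR level: assert */
#define APIC_DM_NMI      0x00400u  /* ICR delivery mode: NMI */

extern void lapic_icr_write(unsigned int val);  /* assumed xAPIC ICR write helper */
extern void cpu_relax(void);                    /* assumed pause/yield helper */

static volatile unsigned long nmi_count;        /* incremented by the NMI handler */

static int nmis_currently_blocked(void)
{
	unsigned long before = nmi_count;
	long i;

	/* Self-NMI: it is delivered immediately only if NMIs are unblocked. */
	lapic_icr_write(APIC_DEST_SELF | APIC_INT_ASSERT | APIC_DM_NMI);

	/* Give the self-NMI time to arrive. */
	for (i = 0; i < 1000000; i++)
		cpu_relax();

	return nmi_count == before;  /* not delivered => NMIs are blocked */
}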

> The scenario here is that virtual NMIs are enabled, in which case
> VM-Enter, not VM-Exit, effectively clears NMI blocking.  From "26.7.1
> Interruptibility State":
> 
>   The blocking of non-maskable interrupts (NMIs) is determined as follows:
>     * If the "virtual NMIs" VM-execution control is 0, NMIs are blocked if
>       and only if bit 3 (blocking by NMI) in the interruptibility-state
>       field is 1.  If the "NMI exiting" VM-execution control is 0,
>       execution of the IRET instruction removes this blocking (even if the
>       instruction generates a fault).  If the "NMI exiting" control is 1,
>       IRET does not affect this blocking.
>     * The following items describe the use of bit 3 (blocking by NMI) in
>       the interruptibility-state field if the "virtual NMIs" VM-execution
>       control is 1:
>         * The bit’s value does not affect the blocking of NMIs after VM
>           entry.  NMIs are not blocked in VMX non-root operation (except
>           for ordinary blocking for other reasons, such as by the MOV SS
>           instruction, the wait-for-SIPI state, etc.)
>         * The bit’s value determines whether there is virtual-NMI blocking
>           after VM entry.  If the bit is 1, virtual-NMI blocking is in
>           effect after VM entry.  If the bit is 0, there is no virtual-NMI
>           blocking after VM entry unless the VM entry is injecting an NMI
>           (see Section 26.6.1.1).  Execution of IRET removes virtual-NMI
>           blocking (even if the instruction generates a fault).
> 
> I.e. forcing NMIs to be unblocked is wrong when virtual NMIs are disabled.
> 
> Unfortunately, that means fixing this will require a much more involved
> patch (series?), e.g. KVM can't modify NMI blocking until the VM-Enter is
> successful, at which point vmcs02, not vmcs01, is loaded, and so KVM will
> likely need to track NMI blocking in a software variable.  That in turn
> gets complicated by the !vNMI case, because then KVM needs to propagate
> NMI blocking between vmcs01, vmcs12, and vmcs02.  Blech.
> 
Yes, handling NMIs perfectly in nested virtualization may require a
complicated implementation. There are many strange cases to think about
(e.g. the priority between an NMI-window VM exit and NMI interrupts).
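
To make the software-tracking idea above concrete, a purely illustrative sketch (none of these names exist in KVM) could keep the blocking state in a per-vCPU variable and update it on nested transitions per the SDM sections quoted above; the hard part, as noted, is the !vNMI case, where the state would also have to be propagated to and from vmcs02:

#include <stdbool.h>

/* Hypothetical per-vCPU software view of NMI blocking; not KVM code. */
struct nested_nmi_state {
	bool nmi_blocked;  /* blocking of physical NMIs as L1 will observe it */
};

/*
 * A successful VM entry to L2 with "virtual NMIs" = 1 leaves NMIs unblocked
 * in non-root operation (SDM 26.7.1).  With vNMI disabled, blocking follows
 * bit 3 of the vmcs12 interruptibility state and can change while L2 runs
 * (e.g. via IRET), so it would have to be read back from vmcs02 at VM exit.
 */
static void on_nested_vmentry(struct nested_nmi_state *s,
			      bool vnmi_enabled, bool vmcs12_blocking_by_nmi)
{
	s->nmi_blocked = vnmi_enabled ? false : vmcs12_blocking_by_nmi;
}

/*
 * A VM exit sets blocking by NMI only when the exit is caused directly by
 * an NMI; all other exits leave the blocking state unchanged (SDM 27.5.5).
 */
static void on_nested_vmexit(struct nested_nmi_state *s, bool exit_due_to_nmi)
{
	if (exit_due_to_nmi)
		s->nmi_blocked = true;
}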

> I'm going to punt fixing this due to lack of bandwidth, and AFAIK lack of
> a use case beyond testing.  Hopefully I'll be able to revisit this in a
> few weeks, but that might be wishful thinking.
> 
I agree. This case probably only appears in testing; I can't think of a
good reason for a hypervisor to perform a VM entry with NMIs blocked.


