Re: [PATCH 2/4] KVM: nSVM: do not forward NMI window singlestep VM exits to L1

On Mon, Jun 19, 2017 at 2:50 PM, Ladi Prosek <lprosek@xxxxxxxxxx> wrote:
> On Fri, Jun 16, 2017 at 3:26 PM, Radim Krčmář <rkrcmar@xxxxxxxxxx> wrote:
>> 2017-06-15 13:20+0200, Ladi Prosek:
>>> A nested hypervisor should not see singlestep VM exits if singlestepping
>>> was enabled internally by KVM. Windows is particularly sensitive to this
>>> and is known to bluescreen on unexpected VM exits.
>>>
>>> Signed-off-by: Ladi Prosek <lprosek@xxxxxxxxxx>
>>> ---
>>> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
>>> @@ -966,9 +967,13 @@ static void svm_disable_lbrv(struct vcpu_svm *svm)
>>>  static void disable_nmi_singlestep(struct vcpu_svm *svm)
>>>  {
>>>       svm->nmi_singlestep = false;
>>> -     if (!(svm->vcpu.guest_debug & KVM_GUESTDBG_SINGLESTEP))
>>> -             svm->vmcb->save.rflags &=
>>> -                     ~(X86_EFLAGS_TF | X86_EFLAGS_RF);
>>> +     if (!(svm->vcpu.guest_debug & KVM_GUESTDBG_SINGLESTEP)) {
>>> +             /* Clear our flags if they were not set by the guest */
>>> +             if (!(svm->nmi_singlestep_guest_rflags & X86_EFLAGS_TF))
>>> +                     svm->vmcb->save.rflags &= ~X86_EFLAGS_TF;
>>> +             if (!(svm->nmi_singlestep_guest_rflags & X86_EFLAGS_RF))
>>> +                     svm->vmcb->save.rflags &= ~X86_EFLAGS_RF;
>>
>> IIUC, we intercept/fault on IRET, disable the interception, set TF+RF
>> and enter again, the CPU executes IRET and then we get a #DB exit.
>>
>> IRET pops the EFLAGS saved before the NMI -- doesn't the CPU restore
>> EFLAGS properly on its own, making this part unnecessary?
>
> My test VM doesn't work without this part, even with the change that
> Paolo suggested in 0/4:
>
> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> index d1efe2c62b3f..15a2f7f8e539 100644
> --- a/arch/x86/kvm/svm.c
> +++ b/arch/x86/kvm/svm.c
> @@ -4622,6 +4622,9 @@ static void enable_nmi_window(struct kvm_vcpu *vcpu)
>         if ((svm->vcpu.arch.hflags & (HF_NMI_MASK | HF_IRET_MASK))
>             == HF_NMI_MASK)
>                 return; /* IRET will cause a vm exit */
> +       if ((svm->vcpu.arch.hflags & (HF_NMI_MASK | HF_GIF_MASK))
> +           == HF_NMI_MASK)
> +               return; /* STGI will cause a vm exit */
>
>
> Let's see what we're singlestepping over (Windows Server 2016 with Hyper-V).
>
> 1. The very first time we enable NMI singlestep is when running
> nested. The nested guest's rip is just after 'pause' in a 'pause; cmp;
> jne' loop. svm_nmi_allowed returns false because nested_svm_nmi
> returns false - we don't want to inject NMI now because
> svm->nested.intercept has the INTERCEPT_NMI bit set.
>
> 2. Then we find ourselves in L1 with the rip just after 'vmrun'.
> svm_nmi_allowed returns false because gif_set returns false (hflags ==
> HF_HIF_MASK | HF_VINTR_MASK). So we, again, enable NMI singlestep
> (without having disabled it yet).
>
> 3. We singlestep over the instruction immediately following 'vmrun'
> (it's a 'mov rax, [rsp+0x20]' in this trace) and finally disable NMI
> singlestep on this vcpu. We inject the NMI when the guest executes
> stgi.
>
>
> I'll see if I can sort this out. Setting the trap flag to step over
> an instruction whose only significance is that it follows a 'vmrun'
> is indeed unnecessary.

Ok, I think I understand it.

First, Paolo's GIF change should check only HF_GIF_MASK. Whether we're
currently in an NMI or not makes no difference: with GIF clear the NMI
cannot be injected anyway, and we already have interception in place
for when GIF becomes set, so we don't need to singlestep.
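
Concretely, something like this on top of Paolo's hunk (untested
sketch; gif_set() is the existing helper that tests HF_GIF_MASK):

-       if ((svm->vcpu.arch.hflags & (HF_NMI_MASK | HF_GIF_MASK))
-           == HF_NMI_MASK)
-               return; /* STGI will cause a vm exit */
+       if (!gif_set(svm))
+               return; /* STGI will cause a vm exit */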

Second, enable_nmi_window can also do nothing if
svm->nested.exit_required is set. We're not actually going to run the
guest in that case, so there is no need to singlestep.
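
Again only a sketch (untested, and assuming svm->nested.exit_required
keeps its current meaning), added right after the GIF check in
enable_nmi_window:

+       if (svm->nested.exit_required)
+               return; /* we're not going to run the guest yet */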

With these two tweaks my test VM does not generate any #DB exits. We
still occasionally set TF if we have an NMI and an interrupt pending
at the same time (see inject_pending_event, where we check
vcpu->arch.interrupt.pending first and vcpu->arch.nmi_pending only
later), but we clear it immediately in the new code added in patch 4.

What a complex state machine! I'll prepare v2.


