On 26/04/2017 08:34, Ladi Prosek wrote:
> On Tue, Apr 25, 2017 at 6:38 PM, David Hildenbrand <david@xxxxxxxxxx> wrote:
>> On 25.04.2017 16:42, Ladi Prosek wrote:
>>> On AMD, the effect of set_nmi_mask called by emulate_iret_real and em_rsm
>>> on hflags is reverted later on in x86_emulate_instruction where hflags are
>>> overwritten with ctxt->emul_flags (the kvm_set_hflags call). This manifests
>>> as a hang when rebooting Windows VMs with QEMU, OVMF, and >1 vcpu.
>>>
>>> Instead of trying to merge ctxt->emul_flags into vcpu->arch.hflags after
>>> an instruction is emulated, this commit deletes emul_flags altogether and
>>> makes the emulator access vcpu->arch.hflags using two new accessors. This
>>> way all changes, on the emulator side as well as in functions called from
>>> the emulator and accessing vcpu state with emul_to_vcpu, are preserved.
>>>
>>> More details on the bug and its manifestation with Windows and OVMF:
>>>
>>> It's a KVM bug in the interaction between SMI/SMM and NMI, specific to AMD.
>>> I believe that the SMM part explains why we started seeing this only with
>>> OVMF.
>>>
>>> KVM masks and unmasks NMI when entering and leaving SMM. When KVM emulates
>>> the RSM instruction in em_rsm, the set_nmi_mask call doesn't stick because
>>> later on in x86_emulate_instruction we overwrite arch.hflags with
>>> ctxt->emul_flags, effectively reverting the effect of the set_nmi_mask call.
>>> The AMD-specific hflag of interest here is HF_NMI_MASK.
>>>
>>> When rebooting the system, Windows sends an NMI IPI to all but the current
>>> cpu to shut them down. Only after all of them are parked in HLT will the
>>> initiating cpu finish the restart. If NMI is masked, other cpus never get
>>> the memo and the initiating cpu spins forever, waiting for
>>> hal!HalpInterruptProcessorsStarted to drop. That's the symptom we observe.
>>>
>>> Fixes: a584539b24b8 ("KVM: x86: pass the whole hflags field to emulator and back")
>>> Signed-off-by: Ladi Prosek <lprosek@xxxxxxxxxx>
>> [...]
>>>
>>> -	if ((ctxt->emul_flags & X86EMUL_SMM_INSIDE_NMI_MASK) == 0)
>>> +	if ((ctxt->ops->get_hflags(ctxt) & X86EMUL_SMM_INSIDE_NMI_MASK) == 0)
>>>  		ctxt->ops->set_nmi_mask(ctxt, false);
>>>
>>> -	ctxt->emul_flags &= ~X86EMUL_SMM_INSIDE_NMI_MASK;
>>> -	ctxt->emul_flags &= ~X86EMUL_SMM_MASK;
>>> +	ctxt->ops->set_hflags(ctxt, ctxt->ops->get_hflags(ctxt) &
>>> +			      ~X86EMUL_SMM_INSIDE_NMI_MASK &
>>> +			      ~X86EMUL_SMM_MASK);
>>
>> Wonder if that would look better with
>> & ~(X86EMUL_SMM_INSIDE_NMI_MASK | X86EMUL_SMM_MASK)
>
> Yes, it definitely would.
>
>> [...]
>>>
>>>  static void toggle_interruptibility(struct kvm_vcpu *vcpu, u32 mask)
>>> @@ -5314,7 +5326,6 @@ static void init_emulate_ctxt(struct kvm_vcpu *vcpu)
>>>  	BUILD_BUG_ON(HF_GUEST_MASK != X86EMUL_GUEST_MASK);
>>>  	BUILD_BUG_ON(HF_SMM_MASK != X86EMUL_SMM_MASK);
>>>  	BUILD_BUG_ON(HF_SMM_INSIDE_NMI_MASK != X86EMUL_SMM_INSIDE_NMI_MASK);
>>> -	ctxt->emul_flags = vcpu->arch.hflags;
>>>
>>>  	init_decode_cache(ctxt);
>>>  	vcpu->arch.emulate_regs_need_sync_from_vcpu = false;
>>> @@ -5718,8 +5729,6 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu,
>>>  		unsigned long rflags = kvm_x86_ops->get_rflags(vcpu);
>>>  		toggle_interruptibility(vcpu, ctxt->interruptibility);
>>>  		vcpu->arch.emulate_regs_need_sync_to_vcpu = false;
>>> -		if (vcpu->arch.hflags != ctxt->emul_flags)
>>> -			kvm_set_hflags(vcpu, ctxt->emul_flags);
>>
>> I like to see that go.
>>
>>>  		kvm_rip_write(vcpu, ctxt->eip);
>>>  		if (r == EMULATE_DONE)
>>>  			kvm_vcpu_check_singlestep(vcpu, rflags, &r);
>>>
>>
>> Looks very good to me!
>
> Thank you for the review!

Fixed and queued.

Paolo
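
[Editor's note] For reference, a minimal sketch of what the two new accessors
described in the commit message could look like on the KVM side. The function
names (emulator_get_hflags/emulator_set_hflags) and the exact placement are
assumptions for illustration; the thread only confirms that the emulator
reaches vcpu->arch.hflags via emul_to_vcpu and kvm_set_hflags:

    /*
     * Sketch: emulator callbacks that read/write hflags through live vcpu
     * state (function names are assumed, not taken from the thread).
     */
    static unsigned emulator_get_hflags(struct x86_emulate_ctxt *ctxt)
    {
            /* Read the live hflags instead of a cached emul_flags copy. */
            return emul_to_vcpu(ctxt)->arch.hflags;
    }

    static void emulator_set_hflags(struct x86_emulate_ctxt *ctxt,
                                    unsigned emul_flags)
    {
            /*
             * Write through immediately. Since no private emul_flags copy
             * remains to be synced back in x86_emulate_instruction, changes
             * made via set_nmi_mask during emulation (e.g. in em_rsm) stick.
             */
            kvm_set_hflags(emul_to_vcpu(ctxt), emul_flags);
    }

Both would then be wired into the emulator's ops table so that
ctxt->ops->get_hflags(ctxt) and ctxt->ops->set_hflags(ctxt, ...) in the quoted
em_rsm hunk resolve to them.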
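
And with David's & ~(X86EMUL_SMM_INSIDE_NMI_MASK | X86EMUL_SMM_MASK)
suggestion folded in, the flag-clearing statement in em_rsm would read:

    ctxt->ops->set_hflags(ctxt, ctxt->ops->get_hflags(ctxt) &
                          ~(X86EMUL_SMM_INSIDE_NMI_MASK | X86EMUL_SMM_MASK));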