On Tue, Apr 25, 2017 at 6:38 PM, David Hildenbrand <david@xxxxxxxxxx> wrote:
> On 25.04.2017 16:42, Ladi Prosek wrote:
>> On AMD, the effect of set_nmi_mask called by emulate_iret_real and em_rsm
>> on hflags is reverted later on in x86_emulate_instruction where hflags are
>> overwritten with ctxt->emul_flags (the kvm_set_hflags call). This manifests
>> as a hang when rebooting Windows VMs with QEMU, OVMF, and >1 vcpu.
>>
>> Instead of trying to merge ctxt->emul_flags into vcpu->arch.hflags after
>> an instruction is emulated, this commit deletes emul_flags altogether and
>> makes the emulator access vcpu->arch.hflags using two new accessors. This
>> way all changes, on the emulator side as well as in functions called from
>> the emulator and accessing vcpu state with emul_to_vcpu, are preserved.
>>
>> More details on the bug and its manifestation with Windows and OVMF:
>>
>> It's a KVM bug in the interaction between SMI/SMM and NMI, specific to AMD.
>> I believe that the SMM part explains why we started seeing this only with
>> OVMF.
>>
>> KVM masks and unmasks NMI when entering and leaving SMM. When KVM emulates
>> the RSM instruction in em_rsm, the set_nmi_mask call doesn't stick because
>> later on in x86_emulate_instruction we overwrite arch.hflags with
>> ctxt->emul_flags, effectively reverting the effect of the set_nmi_mask call.
>> The AMD-specific hflag of interest here is HF_NMI_MASK.
>>
>> When rebooting the system, Windows sends an NMI IPI to all but the current
>> cpu to shut them down. Only after all of them are parked in HLT will the
>> initiating cpu finish the restart. If NMI is masked, other cpus never get
>> the memo and the initiating cpu spins forever, waiting for
>> hal!HalpInterruptProcessorsStarted to drop. That's the symptom we observe.
>>
>> Fixes: a584539b24b8 ("KVM: x86: pass the whole hflags field to emulator and back")
>> Signed-off-by: Ladi Prosek <lprosek@xxxxxxxxxx>
> [...]
>>
>> -        if ((ctxt->emul_flags & X86EMUL_SMM_INSIDE_NMI_MASK) == 0)
>> +        if ((ctxt->ops->get_hflags(ctxt) & X86EMUL_SMM_INSIDE_NMI_MASK) == 0)
>>                  ctxt->ops->set_nmi_mask(ctxt, false);
>>
>> -        ctxt->emul_flags &= ~X86EMUL_SMM_INSIDE_NMI_MASK;
>> -        ctxt->emul_flags &= ~X86EMUL_SMM_MASK;
>> +        ctxt->ops->set_hflags(ctxt, ctxt->ops->get_hflags(ctxt) &
>> +                              ~X86EMUL_SMM_INSIDE_NMI_MASK &
>> +                              ~X86EMUL_SMM_MASK);
>
> Wonder if that would look better with
> & ~(X86EMUL_SMM_INSIDE_NMI_MASK | X86EMUL_SMM_MASK)

Yes, it definitely would.

> [...]
>>
>>  static void toggle_interruptibility(struct kvm_vcpu *vcpu, u32 mask)
>> @@ -5314,7 +5326,6 @@ static void init_emulate_ctxt(struct kvm_vcpu *vcpu)
>>          BUILD_BUG_ON(HF_GUEST_MASK != X86EMUL_GUEST_MASK);
>>          BUILD_BUG_ON(HF_SMM_MASK != X86EMUL_SMM_MASK);
>>          BUILD_BUG_ON(HF_SMM_INSIDE_NMI_MASK != X86EMUL_SMM_INSIDE_NMI_MASK);
>> -        ctxt->emul_flags = vcpu->arch.hflags;
>>
>>          init_decode_cache(ctxt);
>>          vcpu->arch.emulate_regs_need_sync_from_vcpu = false;
>> @@ -5718,8 +5729,6 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu,
>>                  unsigned long rflags = kvm_x86_ops->get_rflags(vcpu);
>>                  toggle_interruptibility(vcpu, ctxt->interruptibility);
>>                  vcpu->arch.emulate_regs_need_sync_to_vcpu = false;
>> -                if (vcpu->arch.hflags != ctxt->emul_flags)
>> -                        kvm_set_hflags(vcpu, ctxt->emul_flags);
>
> I like to see that go.
>
>>                  kvm_rip_write(vcpu, ctxt->eip);
>>                  if (r == EMULATE_DONE)
>>                          kvm_vcpu_check_singlestep(vcpu, rflags, &r);
>>
>
> Looks very good to me!

Thank you for the review!

> Reviewed-by: David Hildenbrand <david@xxxxxxxxxx>
>
> --
>
> Thanks,
>
> David
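
For illustration, here is a minimal standalone sketch of the stale-copy
problem the patch removes. This is not the actual KVM code: struct vcpu,
the flag constants, and the two emulate_rsm_* functions are simplified
stand-ins for vcpu->arch.hflags, em_rsm, and the hflags write-back that
used to sit in x86_emulate_instruction.

/*
 * Sketch only, under simplified assumptions: if the emulator works on a
 * private snapshot of hflags and writes the snapshot back wholesale, any
 * hflags update made through a side channel (here, set_nmi_mask) in the
 * meantime is lost. Routing every access through the live vcpu state, as
 * the get_hflags/set_hflags accessors do, keeps all updates visible.
 */
#include <stdio.h>

#define HF_SMM_MASK            (1u << 0)
#define HF_SMM_INSIDE_NMI_MASK (1u << 1)
#define HF_NMI_MASK            (1u << 2)

struct vcpu { unsigned int hflags; };

/* Analogue of svm's set_nmi_mask(): writes vcpu state directly. */
static void set_nmi_mask(struct vcpu *v, int masked)
{
        if (masked)
                v->hflags |= HF_NMI_MASK;
        else
                v->hflags &= ~HF_NMI_MASK;
}

/* Old flow: snapshot, modify the snapshot, write it back wholesale. */
static void emulate_rsm_buggy(struct vcpu *v)
{
        unsigned int emul_flags = v->hflags;            /* snapshot */

        if (!(emul_flags & HF_SMM_INSIDE_NMI_MASK))
                set_nmi_mask(v, 0);                     /* clears HF_NMI_MASK... */
        emul_flags &= ~(HF_SMM_INSIDE_NMI_MASK | HF_SMM_MASK);
        if (v->hflags != emul_flags)                    /* mirrors kvm_set_hflags() */
                v->hflags = emul_flags;                 /* ...and re-sets it here */
}

/* New flow: every read and write goes through the live vcpu state. */
static void emulate_rsm_fixed(struct vcpu *v)
{
        if (!(v->hflags & HF_SMM_INSIDE_NMI_MASK))
                set_nmi_mask(v, 0);
        v->hflags &= ~(HF_SMM_INSIDE_NMI_MASK | HF_SMM_MASK);
}

int main(void)
{
        struct vcpu v = { HF_SMM_MASK | HF_NMI_MASK };  /* in SMM, NMI masked */

        emulate_rsm_buggy(&v);
        printf("buggy: NMI still masked after RSM? %s\n",
               (v.hflags & HF_NMI_MASK) ? "yes (reboot hang)" : "no");

        v.hflags = HF_SMM_MASK | HF_NMI_MASK;
        emulate_rsm_fixed(&v);
        printf("fixed: NMI still masked after RSM? %s\n",
               (v.hflags & HF_NMI_MASK) ? "yes (reboot hang)" : "no");
        return 0;
}

Compiled and run, the first line shows HF_NMI_MASK surviving RSM (the
Windows reboot hang described above), while the accessor-style flow
clears it as intended. Note the write-back fires even though the NMI bit
was the only disagreement the snapshot should have lost: the SMM bits
differ too, so the != guard in the old code offered no protection.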