Re: [PATCH] KVM: x86: fix emulation of RSM and IRET instructions

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



2017-05-18 15:17 GMT+08:00 Ladi Prosek <lprosek@xxxxxxxxxx>:
> Hi Wanpeng,
>
> On Wed, May 17, 2017 at 9:05 AM, Wanpeng Li <kernellwp@xxxxxxxxx> wrote:
>> Hi Ladi,
>> 2017-04-25 22:42 GMT+08:00 Ladi Prosek <lprosek@xxxxxxxxxx>:
>>> On AMD, the effect of set_nmi_mask called by emulate_iret_real and em_rsm
>>> on hflags is reverted later on in x86_emulate_instruction where hflags are
>>> overwritten with ctxt->emul_flags (the kvm_set_hflags call). This manifests
>>> as a hang when rebooting Windows VMs with QEMU, OVMF, and >1 vcpu.
>>>
>>> Instead of trying to merge ctxt->emul_flags into vcpu->arch.hflags after
>>> an instruction is emulated, this commit deletes emul_flags altogether and
>>> makes the emulator access vcpu->arch.hflags using two new accessors. This
>>> way all changes, on the emulator side as well as in functions called from
>>> the emulator and accessing vcpu state with emul_to_vcpu, are preserved.
>>>
>>> More details on the bug and its manifestation with Windows and OVMF:
>>>
>>>   It's a KVM bug in the interaction between SMI/SMM and NMI, specific to AMD.
>>>   I believe that the SMM part explains why we started seeing this only with
>>>   OVMF.
>>>
>>>   KVM masks and unmasks NMI when entering and leaving SMM. When KVM emulates
>>>   the RSM instruction in em_rsm, the set_nmi_mask call doesn't stick because
>>>   later on in x86_emulate_instruction we overwrite arch.hflags with
>>>   ctxt->emul_flags, effectively reverting the effect of the set_nmi_mask call.
>>>   The AMD-specific hflag of interest here is HF_NMI_MASK.
>>>
>>>   When rebooting the system, Windows sends an NMI IPI to all but the current
>>>   cpu to shut them down. Only after all of them are parked in HLT will the
>>>   initiating cpu finish the restart. If NMI is masked, other cpus never get
>>>   the memo and the initiating cpu spins forever, waiting for
>>>   hal!HalpInterruptProcessorsStarted to drop. That's the symptom we observe.
>>>
>>> Fixes: a584539b24b8 ("KVM: x86: pass the whole hflags field to emulator and back")
>>> Signed-off-by: Ladi Prosek <lprosek@xxxxxxxxxx>
>>
>> [snip]
>>
>>>
>>> -       if ((ctxt->emul_flags & X86EMUL_SMM_INSIDE_NMI_MASK) == 0)
>>> +       if ((ctxt->ops->get_hflags(ctxt) & X86EMUL_SMM_INSIDE_NMI_MASK) == 0)
>>>                 ctxt->ops->set_nmi_mask(ctxt, false);
>>>
>>> -       ctxt->emul_flags &= ~X86EMUL_SMM_INSIDE_NMI_MASK;
>>> -       ctxt->emul_flags &= ~X86EMUL_SMM_MASK;
>>
>> This will clear SMM related flags for ctxt->emul_flags, so why
>> overwrites arch.hflags with ctxt->emul_flags matters in
>> x86_emulate_instruction? In addition, nmi mask is cleared in the above
>> codes, I didn't find where can set nmi mask again due to SMM unless
>> enter SMM.
>
> The flag that gets overwritten is HF_NMI_MASK.
>
> em_rsm calls ctxt->ops->set_nmi_mask which points to svm_set_nmi_mask on AMD.
> svm_set_nmi_mask modifies vcpu.arch.hflags, but
> x86_emulate_instruction then called kvm_set_hflags(vcpu,
> ctxt->emul_flags) and vcpu.arch.hflags was overwritten with its
> original value (as saved in init_emulate_ctxt) plus whatever changes
> to ctxt->emul_flags were made as part of emulating the instruction.
> The effect of svm_set_nmi_mask on vcpu.arch.hflags was lost.

I see, vmx keeps the nmi mask information by vmx->nmi_known_unmasked
which will not be influenced by this overwritten, thanks for the
explanation.

Regards,
Wanpeng Li




[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux