Hi Wanpeng, On Wed, May 17, 2017 at 9:05 AM, Wanpeng Li <kernellwp@xxxxxxxxx> wrote: > Hi Ladi, > 2017-04-25 22:42 GMT+08:00 Ladi Prosek <lprosek@xxxxxxxxxx>: >> On AMD, the effect of set_nmi_mask called by emulate_iret_real and em_rsm >> on hflags is reverted later on in x86_emulate_instruction where hflags are >> overwritten with ctxt->emul_flags (the kvm_set_hflags call). This manifests >> as a hang when rebooting Windows VMs with QEMU, OVMF, and >1 vcpu. >> >> Instead of trying to merge ctxt->emul_flags into vcpu->arch.hflags after >> an instruction is emulated, this commit deletes emul_flags altogether and >> makes the emulator access vcpu->arch.hflags using two new accessors. This >> way all changes, on the emulator side as well as in functions called from >> the emulator and accessing vcpu state with emul_to_vcpu, are preserved. >> >> More details on the bug and its manifestation with Windows and OVMF: >> >> It's a KVM bug in the interaction between SMI/SMM and NMI, specific to AMD. >> I believe that the SMM part explains why we started seeing this only with >> OVMF. >> >> KVM masks and unmasks NMI when entering and leaving SMM. When KVM emulates >> the RSM instruction in em_rsm, the set_nmi_mask call doesn't stick because >> later on in x86_emulate_instruction we overwrite arch.hflags with >> ctxt->emul_flags, effectively reverting the effect of the set_nmi_mask call. >> The AMD-specific hflag of interest here is HF_NMI_MASK. >> >> When rebooting the system, Windows sends an NMI IPI to all but the current >> cpu to shut them down. Only after all of them are parked in HLT will the >> initiating cpu finish the restart. If NMI is masked, other cpus never get >> the memo and the initiating cpu spins forever, waiting for >> hal!HalpInterruptProcessorsStarted to drop. That's the symptom we observe. >> >> Fixes: a584539b24b8 ("KVM: x86: pass the whole hflags field to emulator and back") >> Signed-off-by: Ladi Prosek <lprosek@xxxxxxxxxx> > > [snip] > >> >> - if ((ctxt->emul_flags & X86EMUL_SMM_INSIDE_NMI_MASK) == 0) >> + if ((ctxt->ops->get_hflags(ctxt) & X86EMUL_SMM_INSIDE_NMI_MASK) == 0) >> ctxt->ops->set_nmi_mask(ctxt, false); >> >> - ctxt->emul_flags &= ~X86EMUL_SMM_INSIDE_NMI_MASK; >> - ctxt->emul_flags &= ~X86EMUL_SMM_MASK; > > This will clear SMM related flags for ctxt->emul_flags, so why > overwrites arch.hflags with ctxt->emul_flags matters in > x86_emulate_instruction? In addition, nmi mask is cleared in the above > codes, I didn't find where can set nmi mask again due to SMM unless > enter SMM. The flag that gets overwritten is HF_NMI_MASK. em_rsm calls ctxt->ops->set_nmi_mask which points to svm_set_nmi_mask on AMD. svm_set_nmi_mask modifies vcpu.arch.hflags, but x86_emulate_instruction then called kvm_set_hflags(vcpu, ctxt->emul_flags) and vcpu.arch.hflags was overwritten with its original value (as saved in init_emulate_ctxt) plus whatever changes to ctxt->emul_flags were made as part of emulating the instruction. The effect of svm_set_nmi_mask on vcpu.arch.hflags was lost. Thanks, Ladi