----- Original Message -----
> From: "Paolo Bonzini" <pbonzini@xxxxxxxxxx>
> To: "Wanpeng Li" <kernellwp@xxxxxxxxx>
> Cc: linux-kernel@xxxxxxxxxxxxxxx, "kvm" <kvm@xxxxxxxxxxxxxxx>, yfu@xxxxxxxxxx, "Eduardo Habkost"
> <ehabkost@xxxxxxxxxx>
> Sent: Monday, November 13, 2017 4:32:09 PM
> Subject: Re: [PATCH] KVM: x86: inject exceptions produced by x86_decode_insn
>
> On 13/11/2017 08:15, Wanpeng Li wrote:
> > 2017-11-10 17:49 GMT+08:00 Paolo Bonzini <pbonzini@xxxxxxxxxx>:
> >> Sometimes, a processor might execute an instruction while another
> >> processor is updating the page tables for that instruction's code page,
> >> but before the TLB shootdown completes. The interesting case happens
> >> if the page is in the TLB.
> >>
> >> In general, the processor will succeed in executing the instruction and
> >> nothing bad happens. However, what if the instruction is an MMIO access?
> >> If *that* happens, KVM invokes the emulator, and the emulator gets the
> >> updated page tables. If the update side had marked the code page as non
> >> present, the page table walk then will fail and so will x86_decode_insn.
> >>
> >> Unfortunately, even though kvm_fetch_guest_virt is correctly returning
> >> X86EMUL_PROPAGATE_FAULT, x86_decode_insn's caller treats the failure as
> >> a fatal error if the instruction cannot simply be reexecuted (as is the
> >> case for MMIO). And this in fact happened sometimes when rebooting
> >> Windows 2012r2 guests. Just checking ctxt->have_exception and injecting
> >> the exception if true is enough to fix the case.
> >
> > I found the only place which can set ctxt->have_exception is in the
> > function x86_emulate_insn(), and x86_decode_insn() will not set
> > ctxt->have_exception even if kvm_fetch_guest_virt() returns
> > X86EMUL_PROPAGATE_FAULT.
>
> Hmm, you're right. Looks like Yanan has been (un)lucky when trying out
> this patch! :(
>
> Yanan, can you double check that you can reproduce the issue with an
> unpatched kernel? I will work on a kvm-unit-tests testcase

Hi Paolo,

Yes, I can still reproduce it. In the latest acceptance testing, which I
just finished this afternoon, 7 cases failed with this problem (all with
win2012.r2 guests).

With the scratch build provided in bz 1493501, I repeated the test 30
times and it was OK.

Thanks!

Best Wishes
Yanan Fu

>
> Paolo
>
> > Regards,
> > Wanpeng Li
> >
> >>
> >> Thanks to Eduardo Habkost for helping in the debugging of this issue.
> >>
> >> Reported-by: Yanan Fu <yfu@xxxxxxxxxx>
> >> Cc: Eduardo Habkost <ehabkost@xxxxxxxxxx>
> >> Cc: stable@xxxxxxxxxxxxxxx
> >> Signed-off-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> >> ---
> >>  arch/x86/kvm/x86.c | 2 ++
> >>  1 file changed, 2 insertions(+)
> >>
> >> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> >> index 34c85aa2e2d1..6dbed9022797 100644
> >> --- a/arch/x86/kvm/x86.c
> >> +++ b/arch/x86/kvm/x86.c
> >> @@ -5722,6 +5722,8 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu,
> >>                         if (reexecute_instruction(vcpu, cr2, write_fault_to_spt,
> >>                                                 emulation_type))
> >>                                 return EMULATE_DONE;
> >> +                       if (ctxt->have_exception && inject_emulated_exception(vcpu))
> >> +                               return EMULATE_DONE;
> >>                         if (emulation_type & EMULTYPE_SKIP)
> >>                                 return EMULATE_FAIL;
> >>                         return handle_emulation_failure(vcpu);
> >> --
> >> 1.8.3.1
> >>
> >
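
For reference, the order of checks in the quoted hunk can be sketched as a
small standalone C model. This is not kernel code: every name prefixed with
model_ or MODEL_ is hypothetical, the kernel functions are replaced by plain
boolean parameters, and what the return value of inject_emulated_exception()
means in the real kernel is deliberately not modelled. It only illustrates
why the added check has no effect while ctxt->have_exception is unset, which
is the point Wanpeng raises above.

```c
#include <stdbool.h>

/* Hypothetical outcome codes for this model only; MODEL_HANDLE_FAILURE
 * stands in for whatever handle_emulation_failure() would return. */
enum model_result {
    MODEL_EMULATE_DONE,
    MODEL_EMULATE_FAIL,
    MODEL_HANDLE_FAILURE
};

/* Only the single field of the emulator context that the patch tests. */
struct model_ctxt {
    bool have_exception;
};

/*
 * Models the order of checks in the patched hunk once decoding has failed.
 *   can_reexecute  - models reexecute_instruction() returning true
 *   inject_result  - models the return value of inject_emulated_exception()
 *   skip_requested - models (emulation_type & EMULTYPE_SKIP) != 0
 */
enum model_result model_decode_failure(const struct model_ctxt *ctxt,
                                       bool can_reexecute,
                                       bool inject_result,
                                       bool skip_requested)
{
    if (can_reexecute)
        return MODEL_EMULATE_DONE;
    /* The two lines added by the patch. */
    if (ctxt->have_exception && inject_result)
        return MODEL_EMULATE_DONE;
    if (skip_requested)
        return MODEL_EMULATE_FAIL;
    return MODEL_HANDLE_FAILURE;
}
```

In this model, the MMIO case (can_reexecute false, skip_requested false) with
have_exception unset still ends in MODEL_HANDLE_FAILURE, exactly as it would
without the added check. It returns MODEL_EMULATE_DONE only when
have_exception is set and inject_result is true.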