On Fri, Oct 29, 2021, Hou Wenlong wrote:
> On Tue, Oct 26, 2021 at 04:37:50PM +0000, Sean Christopherson wrote:
> > On Fri, Oct 22, 2021, Hou Wenlong wrote:
> > +static int complete_emulated_msr_access(struct kvm_vcpu *vcpu)
> > +{
> > +	if (vcpu->run->msr.error) {
> > +		kvm_inject_gp(vcpu, 0);
> > +		return 1;
> > +	}
> > +
> > +	return kvm_emulate_instruction(vcpu, EMULTYPE_SKIP);
> > +}

...

> The note in x86_emulate_instruction() for EMULTYPE_SKIP said that the
> caller should be responsible for updating interruptibility state and
> injecting single-step #DB.

Urgh, yes.  And that note also very clearly states it's for use only by the
vendor callbacks for exactly that reason.

> And vendor callbacks for kvm_skip_emulated_instruction() also do some special
> things,

Luckily, the emulator also does (almost) all those special things.

> e.g. I found that sev_es guest just skips RIP updating.

Emulation is impossible with sev_es because KVM can't decode the guest code
stream, so that particular wrinkle is out of scope.

> So it may be more appropriate to add a parameter for skip_emulated_instruction()
> callback, which force to use x86_skip_instruction() if the instruction length
> is invalid.

I really don't like the idea of routing this through
kvm_skip_emulated_instruction(); anything originating from the emulator ideally
would be handled within the emulator when possible, especially since we know
that KVM is going to end up in the emulator anyways.

The best idea I can come up with is to add a new emulation type to pair with
_SKIP to handle completion of user exits.  In theory it should be a tiny code
change to add a branch inside the EMULTYPE_SKIP path.

On a related topic, I think EMULTYPE_SKIP fails to handle wrapping EIP when
the guest has a flat code segment.
So this:

From e3511669c40e4d074fb19f43256fc5da8634af14 Mon Sep 17 00:00:00 2001
From: Sean Christopherson <seanjc@xxxxxxxxxx>
Date: Mon, 1 Nov 2021 09:52:35 -0700
Subject: [PATCH] KVM: x86: Handle 32-bit wrap of EIP for EMULTYPE_SKIP with
 flat code seg

Truncate the new EIP to a 32-bit value when handling EMULTYPE_SKIP as the
decode phase does not truncate _eip.  Wrapping the 32-bit boundary is
legal if and only if CS is a flat code segment, but that check is
implicitly handled in the form of limit checks in the decode phase.

Opportunistically prepare for a future fix by storing the result of any
truncation in "eip" instead of "_eip".

Fixes: 1957aa63be53 ("KVM: VMX: Handle single-step #DB for EMULTYPE_SKIP on EPT misconfig")
Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>
---
 arch/x86/kvm/x86.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ac83d873d65b..3d7fc5c21ceb 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8124,7 +8124,12 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
 	 * updating interruptibility state and injecting single-step #DBs.
 	 */
 	if (emulation_type & EMULTYPE_SKIP) {
-		kvm_rip_write(vcpu, ctxt->_eip);
+		if (ctxt->mode != X86EMUL_MODE_PROT64)
+			ctxt->eip = (u32)ctxt->_eip;
+		else
+			ctxt->eip = ctxt->_eip;
+
+		kvm_rip_write(vcpu, ctxt->eip);
 		if (ctxt->eflags & X86_EFLAGS_RF)
 			kvm_set_rflags(vcpu, ctxt->eflags & ~X86_EFLAGS_RF);
 		return 1;
--

followed by the rework with complete_emulated_msr_access() doing
"EMULTYPE_SKIP | EMULTYPE_COMPLETE_USER_EXIT" with this as the functional change
in the emulator:

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 3d7fc5c21ceb..13d4758810d1 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8118,10 +8118,12 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
 		return 1;
 	}
 
+
 	/*
-	 * Note, EMULTYPE_SKIP is intended for use *only* by vendor callbacks
-	 * for kvm_skip_emulated_instruction().  The caller is responsible for
-	 * updating interruptibility state and injecting single-step #DBs.
+	 * EMULTYPE_SKIP without EMULTYPE_COMPLETE_USER_EXIT is intended for
+	 * use *only* by vendor callbacks for kvm_skip_emulated_instruction().
+	 * The caller is responsible for updating interruptibility state and
+	 * injecting single-step #DBs.
 	 */
 	if (emulation_type & EMULTYPE_SKIP) {
 		if (ctxt->mode != X86EMUL_MODE_PROT64)
@@ -8129,6 +8131,9 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
 		else
 			ctxt->eip = ctxt->_eip;
 
+		if (emulation_type & EMULTYPE_COMPLETE_USER_EXIT)
+			goto writeback;
+
 		kvm_rip_write(vcpu, ctxt->eip);
 		if (ctxt->eflags & X86_EFLAGS_RF)
 			kvm_set_rflags(vcpu, ctxt->eflags & ~X86_EFLAGS_RF);
@@ -8198,6 +8203,7 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
 	else
 		r = 1;
 
+writeback:
 	if (writeback) {
 		unsigned long rflags = static_call(kvm_x86_get_rflags)(vcpu);
 		toggle_interruptibility(vcpu, ctxt->interruptibility);
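
For illustration only, the EIP truncation from the first patch can be modeled
outside the kernel.  This is a minimal standalone sketch, not KVM code:
skip_next_ip() and skip_eip_examples() are hypothetical names, and the
"long_mode" flag is an assumed stand-in for the
"ctxt->mode == X86EMUL_MODE_PROT64" check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical userspace model of the EMULTYPE_SKIP fixup; not KVM code.
 * "ip" is the current instruction pointer, "insn_len" the decoded length,
 * and "long_mode" stands in for ctxt->mode == X86EMUL_MODE_PROT64.
 */
uint64_t skip_next_ip(uint64_t ip, unsigned int insn_len, bool long_mode)
{
	/* Models _eip after decode: advanced past the insn, not truncated. */
	uint64_t next = ip + insn_len;

	/* Outside 64-bit mode, keep only the low 32 bits so EIP wraps. */
	if (!long_mode)
		return (uint32_t)next;

	return next;
}

void skip_eip_examples(void)
{
	/* A 3-byte insn at 0xfffffffe wraps to 0x1 outside 64-bit mode. */
	assert(skip_next_ip(0xfffffffeULL, 3, false) == 0x1ULL);

	/* The same arithmetic in 64-bit mode carries into bit 32. */
	assert(skip_next_ip(0xfffffffeULL, 3, true) == 0x100000001ULL);
}
```

Without the cast, the first case would hand 0x100000001 to the RIP write even
though the guest is not in 64-bit mode, which is the wrap the first patch is
meant to handle.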