Re: [PATCH 3/7] KVM: x86: Add emulation_type to not raise #UD on CPL=3 emulation failure

Liran Alon <LIRAN.ALON@xxxxxxxxxx> · Fri, 22 Dec 2017 03:11:05 +0200

On 21/12/17 17:11, Paolo Bonzini wrote:
On 18/12/2017 10:45, Liran Alon wrote:
Next commits are going introduce support for accessing VMware backdoor
ports even though guest's TSS I/O permissions bitmap doesn't allow
access. This mimic VMware hypervisor behavior.

In order to support this, next commits will change VMX/SVM to
intercept #GP which was raised by such access and handle it by calling
the x86 emulator to emulate instruction. Since commit "KVM: x86:
Always allow access to VMware backdoor I/O ports", the x86 emulator
handles access to these I/O ports by not checking these ports against
the TSS I/O permission bitmap.

It turns out that the x86 emulator is incomplete and therefore
certain instructions that can cause #GP cannot be emulated.
Such an example is "INT x" (opcode 0xcd) which reach emulate_int()
which can only emulate instruction if vCPU is in real-mode.

In those cases, we would like the #GP intercept to just forward #GP
as-is to guest as if there was no intercept to begin with.
However, current emulator code always queue #UD exception in case
emulator fails which is not what is wanted in this flow.

This commit address this issue by adding a new emulation_type flag
that will allow the #GP intercept handler to specify it wish to just
be aware of when instruction emulation fails and doesn't want #UD
exception to be queued.

Signed-off-by: Liran Alon <liran.alon@xxxxxxxxxx>
Reviewed-by: Nikita Leshenko <nikita.leshchenko@xxxxxxxxxx>
Signed-off-by: Krish Sadhukhan <krish.sadhukhan@xxxxxxxxxx>
---
  arch/x86/include/asm/kvm_host.h |  1 +
  arch/x86/kvm/x86.c              | 12 ++++++++----
  2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 516798431328..2b7ea1ac4f86 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1168,6 +1168,7 @@ enum emulation_result {
  #define EMULTYPE_SKIP		    (1 << 2)
  #define EMULTYPE_RETRY		    (1 << 3)
  #define EMULTYPE_NO_REEXECUTE	    (1 << 4)
+#define EMULTYPE_NO_UD_ON_FAIL	    (1 << 5)
  int x86_emulate_instruction(struct kvm_vcpu *vcpu, unsigned long cr2,
  			    int emulation_type, void *insn, int insn_len);

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 5fef09674de1..8fd2d3e1bcd4 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5425,7 +5425,7 @@ int kvm_inject_realmode_interrupt(struct kvm_vcpu *vcpu, int irq, int inc_eip)
  }
  EXPORT_SYMBOL_GPL(kvm_inject_realmode_interrupt);

-static int handle_emulation_failure(struct kvm_vcpu *vcpu)
+static int handle_emulation_failure(struct kvm_vcpu *vcpu, int emulation_type)
  {
  	int r = EMULATE_DONE;

@@ -5437,7 +5437,11 @@ static int handle_emulation_failure(struct kvm_vcpu *vcpu)
  		vcpu->run->internal.ndata = 0;
  		r = EMULATE_USER_EXIT;
  	}
-	kvm_queue_exception(vcpu, UD_VECTOR);
+
+	if (emulation_type & EMULTYPE_NO_UD_ON_FAIL)
+		r = EMULATE_FAIL;

There seems to be some overlap between EMULTYPE_VMWARE and
EMULTYPE_NO_UD_ON_FAIL.

Even if you don't specify EMULTYPE_VMWARE, the emulation could fail, but
there should have been no writeback and injecting the original #GP
exception should be safe.

What you mention is true in case x86_emulate_instruction() fails on 
instruction emulation but it could also fail on instruction disassembly.
(See more details below).

Maybe should EMULTYPE_NO_UD_ON_FAIL return
straight away, skipping even the user-mode exit.

I agree it is possible to check
"if (emulation_type & EMULTYPE_NO_UD_ON_FAIL)"
immediately after statistics & tracing, as if this flag is set, we will 
return EMULATE_FAIL anyway (maybe overwriting EMULATE_USER_EXIT).
If that's important, I can do such change in a v2 of this patch.


On the other hand, you may want to have EMULTYPE_VMWARE so as to reduce
the attack surface from the emulator (as the emulator would then be very
easy to trigger, just by executing an instruction that causes #GP).  In
that case, however, emulation of the {in,out}{s,} instructions shouldn't
fail and you shouldn't need EMULTYPE_NO_UD_ON_FAIL.

Consider the case where the CPU raises a #GP on some instruction which 
is now intercepted by KVM. The #GP intercept will call 
x86_emulate_instruction(). If the x86 emulator disassembly engine is 
incomplete and therefore doesn't know how to parse the instruction which 
caused the #GP, x86_decode_insn() will fail which will also reach 
handle_emulation_failure(). If there is no EMULTYPE_NO_UD_ON_FAIL flag, 
this will cause a #UD exception to be queued which is not what we want.
(We would like to preserve behaviour of raising a #GP to guest)

Therefore we can summarize these flags usage as follows:
1. EMULTYPE_NO_UD_ON_FAIL is used to tell emulator "if you fail to 
disassemble the instruction, I just want you to return failure. Do not 
queue a #UD and let me decide what should be the proper response".
2. EMULTYPE_VMWARE is indeed used to avoid making all instructions that 
could raise #GP to reach instruction-emulation as the x86 emulator is 
incomplete anyway and it just, as you say, increase attack surface.

Having said that, I agree the commit messages of the 2 commits 
introducing these flags may not be indicative enough. If we agree on the 
written above, I can fix them in v2 of this series.

What do you think?

Regards,
-Liran


Paolo


+	else
+		kvm_queue_exception(vcpu, UD_VECTOR);

  	return r;
  }
@@ -5731,7 +5735,7 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu,
  				return EMULATE_DONE;
  			if (emulation_type & EMULTYPE_SKIP)
  				return EMULATE_FAIL;
-			return handle_emulation_failure(vcpu);
+			return handle_emulation_failure(vcpu, emulation_type);
  		}
  	}

@@ -5766,7 +5770,7 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu,
  					emulation_type))
  			return EMULATE_DONE;

-		return handle_emulation_failure(vcpu);
+		return handle_emulation_failure(vcpu, emulation_type);
  	}

  	if (ctxt->have_exception) {