[PATCH] KVM: x86: Inject #UD on "unsupported" hypercall if patching fails

Inject a #UD if patching in the correct hypercall fails, e.g. due to
emulator_write_emulated() failing because RIP is mapped not-writable by
the guest.  The guest is likely doomed in any case, but observing a #UD
in the guest is far friendlier to debug/triage than a !WRITABLE #PF with
CR2 pointing at the RIP of the faulting instruction.
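
As a rough illustration of the behavior change, here is a toy userspace
model, not kernel code.  All toy_* names are hypothetical stand-ins; the
only assumption carried over from the patch is that a failed emulated
write used to surface to the guest as a #PF and now surfaces as a #UD.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the emulator's return codes (hypothetical names). */
enum toy_rc { TOY_CONTINUE, TOY_UNHANDLEABLE, TOY_PROPAGATE_FAULT };

/* Exception vectors as the guest would observe them. */
enum toy_vector { TOY_NONE = 0, TOY_UD = 6, TOY_PF = 14 };

struct toy_ctxt {
	bool rip_writable;       /* is the page holding RIP mapped writable? */
	enum toy_vector pending; /* exception queued for injection into the guest */
};

/* Models the emulated write: reports a #PF when RIP is not writable. */
static enum toy_rc toy_write_emulated(const struct toy_ctxt *ctxt,
				      enum toy_vector *fault)
{
	assert(fault != NULL);
	if (!ctxt->rip_writable) {
		*fault = TOY_PF;
		return TOY_PROPAGATE_FAULT;
	}
	return TOY_CONTINUE;
}

/* Old flow: the write's fault lands in the context and reaches the guest. */
static enum toy_rc toy_hypercall_old(struct toy_ctxt *ctxt)
{
	return toy_write_emulated(ctxt, &ctxt->pending);
}

/* New flow: the write's fault is eaten and any failure becomes a #UD. */
static enum toy_rc toy_hypercall_new(struct toy_ctxt *ctxt)
{
	enum toy_vector scratch = TOY_NONE;

	if (toy_write_emulated(ctxt, &scratch) != TOY_CONTINUE) {
		ctxt->pending = TOY_UD;
		return TOY_PROPAGATE_FAULT;
	}
	return TOY_CONTINUE;
}
```

With rip_writable == false, the old flow queues vector 14 and the new
flow queues vector 6; with a writable RIP both succeed and queue nothing.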

Ideally, KVM wouldn't patch at all; it's the guest's responsibility to
identify and use the correct hypercall instruction (VMCALL vs. VMMCALL).
Sadly, Linux kernels prior to commit c1118b3602c2 ("x86: kvm: use
alternatives for VMCALL vs. VMMCALL if kernel text is read-only") do the
wrong thing and blindly use VMCALL, i.e. removing the patching would
break running VMs with older kernels.
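
For reference, the two instructions differ only in their final byte:
VMCALL encodes as 0f 01 c1 and VMMCALL as 0f 01 d9.  A minimal userspace
sketch of the "wrong instruction for this vendor" check follows; the
enum and helper names are hypothetical and are not KVM's.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define HYPERCALL_INSN_LEN 3

/* Hypothetical vendor tag, for illustration only. */
enum cpu_vendor { VENDOR_INTEL, VENDOR_AMD };

/* Architectural encodings of the two hypercall instructions. */
static const uint8_t vmcall_insn[HYPERCALL_INSN_LEN]  = { 0x0f, 0x01, 0xc1 };
static const uint8_t vmmcall_insn[HYPERCALL_INSN_LEN] = { 0x0f, 0x01, 0xd9 };

/* Copy the hypercall instruction that is native to the given vendor. */
static void native_hypercall(enum cpu_vendor vendor, uint8_t *insn)
{
	assert(insn != NULL);
	memcpy(insn, vendor == VENDOR_INTEL ? vmcall_insn : vmmcall_insn,
	       HYPERCALL_INSN_LEN);
}

/* Return 1 if the guest's instruction is not the vendor's native one. */
static int hypercall_needs_patch(enum cpu_vendor vendor,
				 const uint8_t *guest_insn)
{
	uint8_t native[HYPERCALL_INSN_LEN];

	native_hypercall(vendor, native);
	return memcmp(guest_insn, native, HYPERCALL_INSN_LEN) != 0;
}
```

A guest that blindly executes VMCALL on AMD hardware is the case that
needs patching; the same bytes on Intel hardware do not.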

One could argue that KVM should be "fixed" to ignore guest paging
protections instead of injecting #UD, but patching in the first place was
a mistake as it was a hack-a-fix for a guest bug.  There are myriad fatal
issues with KVM's patching:

  1. Patches using an emulated guest write, which will fail if RIP is not
     mapped writable.  This is the issue being mitigated.

  2. Doesn't ensure the write is "atomic", e.g. a hypercall that splits a
     page boundary will be handled as two separate writes, which means
     that a partial, corrupted instruction can be observed by a vCPU.

  3. Doesn't serialize other CPU cores after updating the code stream.

  4. Completely fails to account for the case where KVM is emulating due
     to invalid guest state with unrestricted_guest=0.  Patching and
     retrying the instruction will result in the vCPU getting stuck in an
     infinite loop.
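
To make issue #2 concrete, the arithmetic for when a 3-byte write
straddles a page boundary can be sketched as below.  This is a
standalone illustration that assumes 4 KiB pages; the helper names are
made up for the example and do not exist in KVM.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. */
#define TOY_PAGE_SIZE 4096ULL

/* Number of bytes of [addr, addr + len) that fall in the first page. */
static size_t bytes_in_first_page(uint64_t addr, size_t len)
{
	uint64_t room = TOY_PAGE_SIZE - (addr & (TOY_PAGE_SIZE - 1));

	assert(len > 0);
	return len < room ? len : (size_t)room;
}

/* Return 1 if the write spills into the next page, i.e. would be split. */
static int write_is_split(uint64_t addr, size_t len)
{
	return bytes_in_first_page(addr, len) < len;
}
```

E.g. a hypercall at RIP 0x1ffe has two bytes in one page and the third
in the next, so the patch is performed as two writes and another vCPU
can fetch the instruction in between them.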

But, the "support" is _so_ awful, especially #1, that there's practically
zero chance that a modern guest kernel can rely on KVM to patch the guest.
So, rather than proliferate KVM's bad behavior any further than the
absolute minimum needed for backwards compatibility, just try to make it
suck a little less.

Cc: Hou Wenlong <houwenlong93@xxxxxxxxxxxxxxxxx>
Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>
---
 arch/x86/kvm/emulate.c |  2 +-
 arch/x86/kvm/x86.c     | 13 +++++++++++--
 2 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 28b1a4e57827..3ccf7b73687f 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -3734,7 +3734,7 @@ static int em_hypercall(struct x86_emulate_ctxt *ctxt)
 	int rc = ctxt->ops->fix_hypercall(ctxt);
 
 	if (rc != X86EMUL_CONTINUE)
-		return rc;
+		return emulate_ud(ctxt);
 
 	/* Let the processor re-execute the fixed hypercall */
 	ctxt->_eip = ctxt->eip;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 26cb3a4cd0e9..1a844ad873ba 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9026,11 +9026,20 @@ static int emulator_fix_hypercall(struct x86_emulate_ctxt *ctxt)
 	struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);
 	char instruction[3];
 	unsigned long rip = kvm_rip_read(vcpu);
+	struct x86_exception e;
+	int r;
 
 	static_call(kvm_x86_patch_hypercall)(vcpu, instruction);
 
-	return emulator_write_emulated(ctxt, rip, instruction, 3,
-		&ctxt->exception);
+	/*
+	 * Eat any exceptions, e.g. if RIP is not mapped writable, and simply
+	 * signal failure to the caller.  Faults on the write are (obviously)
+	 * not from the guest, though the guest is likely doomed in any case.
+	 */
+	r = emulator_write_emulated(ctxt, rip, instruction, 3, &e);
+	if (r != X86EMUL_CONTINUE)
+		return X86EMUL_UNHANDLEABLE;
+	return X86EMUL_CONTINUE;
 }
 
 static int dm_request_for_irq_injection(struct kvm_vcpu *vcpu)
-- 
2.34.1.173.g76aa8bc2d0-goog



