Re: [PATCH v3 1/2] KVM: x86: Check hypercall's exit to userspace generically

On 10/31/2024 4:49 AM, Sean Christopherson wrote:
On Mon, Aug 26, 2024, Binbin Wu wrote:
Check whether a KVM hypercall needs to exit to userspace or not based on
hypercall_exit_enabled field of struct kvm_arch.

Userspace can request a hypercall to exit to userspace for handling by
enabling KVM_CAP_EXIT_HYPERCALL, and the enabled hypercall will be set in
hypercall_exit_enabled.  Make the check code generic based on it.

Signed-off-by: Binbin Wu <binbin.wu@xxxxxxxxxxxxxxx>
Reviewed-by: Kai Huang <kai.huang@xxxxxxxxx>
---
  arch/x86/kvm/x86.c | 5 +++--
  arch/x86/kvm/x86.h | 4 ++++
  2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 966fb301d44b..e521f14ad2b2 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -10220,8 +10220,9 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
  	cpl = kvm_x86_call(get_cpl)(vcpu);
  	ret = __kvm_emulate_hypercall(vcpu, nr, a0, a1, a2, a3, op_64_bit, cpl);
-	if (nr == KVM_HC_MAP_GPA_RANGE && !ret)
-		/* MAP_GPA tosses the request to the user space. */
+	/* Check !ret first to make sure nr is a valid KVM hypercall. */
+	if (!ret && user_exit_on_hypercall(vcpu->kvm, nr))
I don't love that the caller has to re-check for user_exit_on_hypercall().
Agreed, it is not ideal.

But if __kvm_emulate_hypercall() returns 0 to indicate user exit and 1 to
indicate success, then the callers have to convert the return code to set
return value for guest.  E.g., TDX code also needs to do the conversion.
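
As a rough illustration of that conversion, here is a standalone sketch. It is
not the actual KVM or TDX code: hc_ret_to_guest() and the HC_* names are
hypothetical, and it only assumes the convention discussed in this thread
('0' == exit to userspace, '1' == success, anything else == error code).

```c
/* Hypothetical names for the internal return convention under discussion. */
enum {
	HC_EXIT_TO_USER = 0,	/* caller returns to userspace, RAX untouched */
	HC_SUCCESS = 1,		/* hypercall handled successfully in the kernel */
};

/*
 * Convert an internal return code into the value the guest sees.
 * Guest ABI: 0 == success, any other value is an error code.
 * Only meaningful when internal_ret != HC_EXIT_TO_USER.
 */
static unsigned long hc_ret_to_guest(unsigned long internal_ret, int op_64_bit)
{
	if (internal_ret == HC_SUCCESS)
		return 0;

	/* Error codes are truncated to 32 bits for non-64-bit callers. */
	return op_64_bit ? internal_ret : (unsigned int)internal_ret;
}
```

Every caller of __kvm_emulate_hypercall() would need an equivalent of this
step before writing the guest's return register.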

I also don't love that there's a surprising number of checks lurking in
__kvm_emulate_hypercall(), e.g. that CPL==0, especially since the above comment
about "a valid KVM hypercall" can be interpreted as meaning KVM is *only* checking
if the hypercall number is valid.

E.g. my initial reaction was that we could add a separate path for userspace
hypercalls, but that would be subtly wrong.  And my second reaction was to hoist
the common checks out of __kvm_emulate_hypercall(), but then I remembered that
the only reason __kvm_emulate_hypercall() is separate is to allow it to be called
by TDX with different source/destination registers.

So, I'm strongly leaning towards dropping the above change, squashing the addition
of the helper with patch 2, and then landing this on top.

Thoughts?
I have no strong preference and am OK with the proposal below.

There are just some cases that don't get the return value right, as pointed
out by Kai in another thread.
https://lore.kernel.org/kvm/3f158732a66829faaeb527a94b8df78d6173befa.camel@xxxxxxxxx/



--
Subject: [PATCH] KVM: x86: Use '0' in __kvm_emulate_hypercall() to signal
  "exit to userspace"

Rework __kvm_emulate_hypercall() to use '0' to indicate an exit to
userspace instead of relying on the caller to manually check for success
*and* if user_exit_on_hypercall() is true.  Use '1' for "success" to
(mostly) align with KVM's de facto return codes, where '0' == exit to
userspace, '1' == resume guest, and -errno == failure.  Unfortunately,
some of the PV error codes returned to the guest are positive values, so
the pattern doesn't exactly match KVM's "standard", but it should be close
enough to be intuitive for KVM readers.

Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>
---
  arch/x86/kvm/x86.c | 21 +++++++++++++++------
  1 file changed, 15 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index e09daa3b157c..5fdeb58221e2 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -10024,7 +10024,7 @@ unsigned long __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr,
  	switch (nr) {
  	case KVM_HC_VAPIC_POLL_IRQ:
-		ret = 0;
+		ret = 1;
  		break;
  	case KVM_HC_KICK_CPU:
  		if (!guest_pv_has(vcpu, KVM_FEATURE_PV_UNHALT))
@@ -10032,7 +10032,7 @@ unsigned long __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr,
  		kvm_pv_kick_cpu_op(vcpu->kvm, a1);
  		kvm_sched_yield(vcpu, a1);
-		ret = 0;
+		ret = 1;
  		break;
  #ifdef CONFIG_X86_64
  	case KVM_HC_CLOCK_PAIRING:
@@ -10050,7 +10050,7 @@ unsigned long __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr,
  			break;
  		kvm_sched_yield(vcpu, a0);
-		ret = 0;
+		ret = 1;
  		break;
  	case KVM_HC_MAP_GPA_RANGE: {
  		u64 gpa = a0, npages = a1, attrs = a2;
@@ -10111,12 +10111,21 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
  	cpl = kvm_x86_call(get_cpl)(vcpu);
  	ret = __kvm_emulate_hypercall(vcpu, nr, a0, a1, a2, a3, op_64_bit, cpl);
-	if (nr == KVM_HC_MAP_GPA_RANGE && !ret)
-		/* MAP_GPA tosses the request to the user space. */
+	if (!ret)
  		return 0;
-	if (!op_64_bit)
+	/*
+	 * KVM's ABI with the guest is that '0' is success, and any other value
+	 * is an error code.  Internally, '0' == exit to userspace (see above)
+	 * and '1' == success, as KVM's de facto standard return codes are that
+	 * plus -errno == failure.  Explicitly check for '1' as some PV error
+	 * codes are positive values.
+	 */
I didn't understand the last sentence:
"Explicitly check for '1' as some PV error codes are positive values."

The functions called in __kvm_emulate_hypercall() for PV features return
-KVM_EXXX as the error code.
Did you mean that functions like kvm_pv_enable_async_pf(), which return
1 for error, might be called in __kvm_emulate_hypercall() in the future?
If this is the concern, then we cannot simply convert 1 to 0.
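
To make the concern concrete, a small standalone sketch. The helper names are
hypothetical; only the "ret == 1 becomes 0" step is taken from the patch
above. It shows that a PV error code whose value happens to be 1 would reach
the guest as 0, i.e. as success.

```c
#include <stdbool.h>

/* Hypothetical name for the internal '1' == success value. */
#define HC_INTERNAL_SUCCESS 1UL

/* Mirrors the "if (ret == 1) ret = 0;" step from the patch above. */
static unsigned long to_guest_rax(unsigned long ret)
{
	return ret == HC_INTERNAL_SUCCESS ? 0 : ret;
}

/* True if a non-zero PV error code would be reported to the guest as success. */
static bool pv_error_is_masked(unsigned long pv_err)
{
	return pv_err != 0 && to_guest_rax(pv_err) == 0;
}
```

Under this sketch, only an error code equal to 1 is masked. Negative codes
such as -KVM_EXXX pass through unchanged.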

+	if (ret == 1)
+		ret = 0;
+	else if (!op_64_bit)
  		ret = (u32)ret;
+
  	kvm_rax_write(vcpu, ret);
  	return kvm_skip_emulated_instruction(vcpu);

base-commit: 675248928970d33f7fc8ca9851a170c98f4f1c4f




