On Fri, 2024-11-01 at 09:39 -0700, Sean Christopherson wrote: > On Fri, Nov 01, 2024, Kai Huang wrote: > > On Thu, 2024-10-31 at 07:54 -0700, Sean Christopherson wrote: > > > On Thu, Oct 31, 2024, Kai Huang wrote: > > > - ret = __kvm_emulate_hypercall(vcpu, nr, a0, a1, a2, a3, op_64_bit, cpl); > > > - if (nr == KVM_HC_MAP_GPA_RANGE && !ret) > > > - /* MAP_GPA tosses the request to the user space. */ > > > - return 0; > > > + r = __kvm_emulate_hypercall(vcpu, nr, a0, a1, a2, a3, op_64_bit, cpl, &ret); > > > + if (r <= r) > > > + return r; > > > > ... should be: > > > > if (r <= 0) > > return r; > > > > ? > > > > Another option might be we move "set hypercall return value" code inside > > __kvm_emulate_hypercall(). So IIUC the reason to split > > __kvm_emulate_hypercall() out is for TDX, and while non-TDX uses RAX to carry > > the hypercall return value, TDX uses R10. > > > > We can additionally pass a "kvm_hypercall_set_ret_func" function pointer to > > __kvm_emulate_hypercall(), and invoke it inside. Then we can change > > __kvm_emulate_hypercall() to return: > > < 0 error, > > ==0 return to userspace, > > > 0 go back to guest. > > Hmm, and the caller can still handle kvm_skip_emulated_instruction(), because the > return value is KVM's normal pattern. > > I like it! > > But, there's no need to pass a function pointer, KVM can write (and read) arbitrary > GPRs, it's just avoided in most cases so that the sanity checks and available/dirty > updates are elided. For this code though, it's easy enough to keep kvm_rxx_read() > for getting values, and eating the overhead of a single GPR write is a perfectly > fine tradeoff for eliminating the return multiplexing. > > Lightly tested. Assuming this works for TDX and passes testing, I'll post a > mini-series next week. > > -- > From: Sean Christopherson <seanjc@xxxxxxxxxx> > Date: Fri, 1 Nov 2024 09:04:00 -0700 > Subject: [PATCH] KVM: x86: Refactor __kvm_emulate_hypercall() to accept reg > names, not values > > Rework __kvm_emulate_hypercall() to take the names of input and output > (guest return value) registers, as opposed to taking the input values and > returning the output value. As part of the refactor, change the actual > return value from __kvm_emulate_hypercall() to be KVM's de facto standard > of '0' == exit to userspace, '1' == resume guest, and -errno == failure. > > Using the return value for KVM's control flow eliminates the multiplexed > return value, where '0' for KVM_HC_MAP_GPA_RANGE (and only that hypercall) > means "exit to userspace". > > Use the direct GPR accessors to read values to avoid the pointless marking > of the registers as available, but use kvm_register_write_raw() for the > guest return value so that the innermost helper doesn't need to multiplex > its return value. Using the generic kvm_register_write_raw() adds very > minimal overhead, so as a one-off in a relatively slow path it's well > worth the code simplification. Ah right :-) > > Suggested-by: Kai Huang <kai.huang@xxxxxxxxx> > Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx> > --- I think Binbin can help to test on TDX, and assuming it works, Reviewed-by: Kai Huang <kai.huang@xxxxxxxxx>