On Tue, 21 May 2024 16:49:52 -0500
Michael Roth <michael.roth@xxxxxxx> wrote:

> On Tue, May 21, 2024 at 08:49:59AM +0800, Binbin Wu wrote:
> >
> >
> > On 5/17/2024 1:23 AM, Paolo Bonzini wrote:
> > > On Thu, May 16, 2024 at 10:29 AM Binbin Wu
> > > <binbin.wu@xxxxxxxxxxxxxxx> wrote:
> > > >
> > > >
> > > > On 5/1/2024 4:51 PM, Michael Roth wrote:
> > > > > SEV-SNP VMs can ask the hypervisor to change the page state
> > > > > in the RMP table to be private or shared using the Page State
> > > > > Change MSR protocol as defined in the GHCB specification.
> > > > >
> > > > > When using gmem, private/shared memory is allocated through
> > > > > separate pools, and KVM relies on userspace issuing a
> > > > > KVM_SET_MEMORY_ATTRIBUTES KVM ioctl to tell the KVM MMU
> > > > > whether or not a particular GFN should be backed by private
> > > > > memory or not.
> > > > >
> > > > > Forward these page state change requests to userspace so that
> > > > > it can issue the expected KVM ioctls. The KVM MMU will handle
> > > > > updating the RMP entries when it is ready to map a private
> > > > > page into a guest.
> > > > >
> > > > > Use the existing KVM_HC_MAP_GPA_RANGE hypercall format to
> > > > > deliver these requests to userspace via KVM_EXIT_HYPERCALL.
> > > > >
> > > > > Signed-off-by: Michael Roth <michael.roth@xxxxxxx>
> > > > > Co-developed-by: Brijesh Singh <brijesh.singh@xxxxxxx>
> > > > > Signed-off-by: Brijesh Singh <brijesh.singh@xxxxxxx>
> > > > > Signed-off-by: Ashish Kalra <ashish.kalra@xxxxxxx>
> > > > > ---
> > > > >  arch/x86/include/asm/sev-common.h |  6 ++++
> > > > >  arch/x86/kvm/svm/sev.c            | 48 +++++++++++++++++++++++++++++++
> > > > >  2 files changed, 54 insertions(+)
> > > > >
> > > > > diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h
> > > > > index 1006bfffe07a..6d68db812de1 100644
> > > > > --- a/arch/x86/include/asm/sev-common.h
> > > > > +++ b/arch/x86/include/asm/sev-common.h
> > > > > @@ -101,11 +101,17 @@ enum psc_op {
> > > > >  	/* GHCBData[11:0] */				\
> > > > >  	GHCB_MSR_PSC_REQ)
> > > > >
> > > > > +#define GHCB_MSR_PSC_REQ_TO_GFN(msr) (((msr) & GENMASK_ULL(51, 12)) >> 12)
> > > > > +#define GHCB_MSR_PSC_REQ_TO_OP(msr) (((msr) & GENMASK_ULL(55, 52)) >> 52)
> > > > > +
> > > > >  #define GHCB_MSR_PSC_RESP		0x015
> > > > >  #define GHCB_MSR_PSC_RESP_VAL(val)			\
> > > > >  	/* GHCBData[63:32] */				\
> > > > >  	(((u64)(val) & GENMASK_ULL(63, 32)) >> 32)
> > > > >
> > > > > +/* Set highest bit as a generic error response */
> > > > > +#define GHCB_MSR_PSC_RESP_ERROR (BIT_ULL(63) | GHCB_MSR_PSC_RESP)
> > > > > +
> > > > >  /* GHCB Hypervisor Feature Request/Response */
> > > > >  #define GHCB_MSR_HV_FT_REQ		0x080
> > > > >  #define GHCB_MSR_HV_FT_RESP		0x081
> > > > > diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> > > > > index e1ac5af4cb74..720775c9d0b8 100644
> > > > > --- a/arch/x86/kvm/svm/sev.c
> > > > > +++ b/arch/x86/kvm/svm/sev.c
> > > > > @@ -3461,6 +3461,48 @@ static void set_ghcb_msr(struct vcpu_svm *svm, u64 value)
> > > > >  	svm->vmcb->control.ghcb_gpa = value;
> > > > >  }
> > > > >
> > > > > +static int snp_complete_psc_msr(struct kvm_vcpu *vcpu)
> > > > > +{
> > > > > +	struct vcpu_svm *svm = to_svm(vcpu);
> > > > > +
> > > > > +	if (vcpu->run->hypercall.ret)
> > > > Do we have definition of ret? I didn't find clear documentation
> > > > about it. According to the code, 0 means successful. Is there
> > > > any other error codes need to or can be interpreted?
> > > They are defined in include/uapi/linux/kvm_para.h
> > >
> > > #define KVM_ENOSYS        1000
> > > #define KVM_EFAULT        EFAULT /* 14 */
> > > #define KVM_EINVAL        EINVAL /* 22 */
> > > #define KVM_E2BIG         E2BIG /* 7 */
> > > #define KVM_EPERM         EPERM /* 1*/
> > > #define KVM_EOPNOTSUPP    95
> > >
> > > Linux however does not expect the hypercall to fail for
> > > SEV/SEV-ES; and it will terminate the guest if the PSC operation
> > > fails for SEV-SNP. So it's best for userspace if the hypercall
> > > always succeeds. :)
> > Thanks for the info.
> >
> > For TDX, it wants to restrict the size of memory range for
> > conversion in one hypercall to avoid a too long latency.
> > Previously, in TDX QEMU patchset v5, the limitation is in userspace
> > and if the size is too big, the status_code will set to
> > TDG_VP_VMCALL_RETRY and the failed GPA for guest to retry is
> > updated.
> > https://lore.kernel.org/all/20240229063726.610065-51-xiaoyao.li@xxxxxxxxx/
> >
> > When TDX converts TDVMCALL_MAP_GPA to KVM_HC_MAP_GPA_RANGE, do you
> > think which is more reasonable to set the restriction? In KVM (TDX
> > specific code) or userspace?
> > If userspace is preferred, then the interface needs to be extended
> > to support it.
>
> With SNP we might get a batch of requests in a single GHCB request,
> and potentially each of those requests need to get sent out to
> userspace as a single KVM_HC_MAP_GPA_RANGE. The subsequent patch here
> handles that in a loop by issuing a new KVM_HC_MAP_GPA_RANGE via the
> completion handler. So we also sort of need to split large requests
> into multiple userspace requests in some cases.
>
> It seems like TDX should be able to do something similar by limiting
> the size of each KVM_HC_MAP_GPA_RANGE to TDX_MAP_GPA_MAX_LEN, and then
> returning TDG_VP_VMCALL_RETRY to guest if the original size was
> greater than TDX_MAP_GPA_MAX_LEN. But at that point you're
> effectively done with the entire request and can return to guest, so
> it actually seems a little more straightforward than the SNP case
> above. E.g. TDX has a 1:1 mapping between TDG_VP_VMCALL_MAP_GPA and
> KVM_HC_MAP_GPA_RANGE events. (And even similar names :))
>
> So doesn't seem like there's a good reason to expose any of these
> throttling details to userspace, in which case existing
> KVM_HC_MAP_GPA_RANGE interface seems like it should be sufficient.
>

Is there any rough data about the latency of private-shared and
shared-private page conversion?

Thanks,
Zhi.

> -Mike
>
> > >
> > > >
> > > > For TDX, it may also want to use KVM_HC_MAP_GPA_RANGE hypercall
> > > > to userspace via KVM_EXIT_HYPERCALL.
> > > Yes, definitely.
> > >
> > > Paolo
> > >
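
For reference, the in-KVM throttling described above for TDX could look
roughly like the sketch below. It is only an illustration, not code from
any posted patch: the value of TDX_MAP_GPA_MAX_LEN, the SKETCH_VMCALL_*
status values, struct map_gpa_plan, plan_map_gpa() and count_exits() are
all made up here. The idea is that each TDG_VP_VMCALL_MAP_GPA turns into
exactly one KVM_HC_MAP_GPA_RANGE exit of bounded size, and the guest is
told to retry from the first GPA that was not covered.

```c
#include <stdint.h>

/* Assumed values: the thread names these constants but gives no numbers. */
#define TDX_MAP_GPA_MAX_LEN	(2ULL << 20)	/* hypothetical 2 MiB cap per exit */
#define SKETCH_VMCALL_SUCCESS	0ULL		/* placeholder status codes, not */
#define SKETCH_VMCALL_RETRY	1ULL		/* the real TDG_VP_VMCALL_* values */

/* Hypothetical description of what a single MAP_GPA request turns into. */
struct map_gpa_plan {
	uint64_t gpa;		/* start of the range forwarded to userspace */
	uint64_t len;		/* bytes covered by this one KVM_HC_MAP_GPA_RANGE */
	uint64_t status;	/* status returned to the guest afterwards */
	uint64_t retry_gpa;	/* first GPA not covered; meaningful only on RETRY */
};

/*
 * Clamp one guest request to the cap. If the request was larger than the
 * cap, only the first TDX_MAP_GPA_MAX_LEN bytes are forwarded and the
 * guest is asked to retry starting at retry_gpa.
 */
struct map_gpa_plan plan_map_gpa(uint64_t gpa, uint64_t size)
{
	struct map_gpa_plan p = {
		.gpa = gpa,
		.len = size,
		.status = SKETCH_VMCALL_SUCCESS,
		.retry_gpa = 0,
	};

	if (size > TDX_MAP_GPA_MAX_LEN) {
		p.len = TDX_MAP_GPA_MAX_LEN;
		p.status = SKETCH_VMCALL_RETRY;
		p.retry_gpa = gpa + TDX_MAP_GPA_MAX_LEN;
	}
	return p;
}

/*
 * Simulate the guest side: keep reissuing the request from retry_gpa
 * until the whole range is converted, and count the userspace exits.
 */
unsigned int count_exits(uint64_t gpa, uint64_t size)
{
	uint64_t end = gpa + size;
	unsigned int exits = 0;

	while (gpa < end) {
		struct map_gpa_plan p = plan_map_gpa(gpa, end - gpa);

		exits++;
		gpa = (p.status == SKETCH_VMCALL_RETRY) ? p.retry_gpa : end;
	}
	return exits;
}
```

With the assumed 2 MiB cap, a 5 MiB conversion would take three exits
(2 MiB, 2 MiB, 1 MiB), and userspace never sees the cap itself, only
ordinary KVM_HC_MAP_GPA_RANGE requests.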