On Sun, Apr 21, 2024, Michael Roth wrote: > diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst > index 85099198a10f..6cf186ed8f66 100644 > --- a/Documentation/virt/kvm/api.rst > +++ b/Documentation/virt/kvm/api.rst > @@ -7066,6 +7066,7 @@ values in kvm_run even if the corresponding bit in kvm_dirty_regs is not set. > struct kvm_user_vmgexit { > #define KVM_USER_VMGEXIT_PSC_MSR 1 > #define KVM_USER_VMGEXIT_PSC 2 > + #define KVM_USER_VMGEXIT_EXT_GUEST_REQ 3 Assuming we can't get massage this into a vendor agnostic exit, there's gotta be a better name than EXT_GUEST_REQ, which is completely meaningless to me and probably most other people that aren't intimately familar with the specs. Request what? > __u32 type; /* KVM_USER_VMGEXIT_* type */ > union { > struct { > @@ -3812,6 +3813,84 @@ static void snp_handle_guest_req(struct vcpu_svm *svm, gpa_t req_gpa, gpa_t resp > ghcb_set_sw_exit_info_2(svm->sev_es.ghcb, SNP_GUEST_ERR(vmm_ret, fw_err)); > } > > +static int snp_complete_ext_guest_req(struct kvm_vcpu *vcpu) > +{ > + struct vcpu_svm *svm = to_svm(vcpu); > + struct vmcb_control_area *control; > + struct kvm *kvm = vcpu->kvm; > + sev_ret_code fw_err = 0; > + int vmm_ret; > + > + vmm_ret = vcpu->run->vmgexit.ext_guest_req.ret; > + if (vmm_ret) { > + if (vmm_ret == SNP_GUEST_VMM_ERR_INVALID_LEN) > + vcpu->arch.regs[VCPU_REGS_RBX] = > + vcpu->run->vmgexit.ext_guest_req.data_npages; > + goto abort_request; > + } > + > + control = &svm->vmcb->control; > + > + /* > + * To avoid the message sequence number getting out of sync between the > + * actual value seen by firmware verses the value expected by the guest, > + * make sure attestations can't get paused on the write-side at this > + * point by holding the lock for the entire duration of the firmware > + * request so that there is no situation where SNP_GUEST_VMM_ERR_BUSY > + * would need to be returned after firmware sees the request. > + */ > + mutex_lock(&snp_pause_attestation_lock); Why is this in KVM? IIUC, KVM only needs to get involved to translate GFNs to PFNs, the rest can go in the sev-dev driver, no? The whole split is weird, seemingly due to KVM "needing" to take this lock. E.g. why is core kernel code in arch/x86/virt/svm/sev.c at all dealing with attestation goo, when pretty much all of the actual usage is (or can be) in sev-dev.c > + > + if (__snp_transaction_is_stale(svm->snp_transaction_id)) > + vmm_ret = SNP_GUEST_VMM_ERR_BUSY; > + else if (!__snp_handle_guest_req(kvm, control->exit_info_1, > + control->exit_info_2, &fw_err)) > + vmm_ret = SNP_GUEST_VMM_ERR_GENERIC; > + > + mutex_unlock(&snp_pause_attestation_lock); > + > +abort_request: > + ghcb_set_sw_exit_info_2(svm->sev_es.ghcb, SNP_GUEST_ERR(vmm_ret, fw_err)); > + > + return 1; /* resume guest */ > +} > + > +static int snp_begin_ext_guest_req(struct kvm_vcpu *vcpu) > +{ > + int vmm_ret = SNP_GUEST_VMM_ERR_GENERIC; > + struct vcpu_svm *svm = to_svm(vcpu); > + unsigned long data_npages; > + sev_ret_code fw_err; > + gpa_t data_gpa; > + > + if (!sev_snp_guest(vcpu->kvm)) > + goto abort_request; > + > + data_gpa = vcpu->arch.regs[VCPU_REGS_RAX]; > + data_npages = vcpu->arch.regs[VCPU_REGS_RBX]; > + > + if (!IS_ALIGNED(data_gpa, PAGE_SIZE)) > + goto abort_request; > + > + svm->snp_transaction_id = snp_transaction_get_id(); > + if (snp_transaction_is_stale(svm->snp_transaction_id)) { Why bother? I assume this is rare, so why not handle it on the backend, i.e. after userspace does its thing? Then KVM doesn't even have to care about checking for stale IDs, I think? None of this makes much sense to my eyes, e.g. AFAICT, the only thing that can pause attestation is userspace, yet the kernel is responsible for tracking whether or not a transaction is fresh?