On 2/24/25 16:55, Sean Christopherson wrote: > On Mon, Feb 24, 2025, Tom Lendacky wrote: >> On 2/18/25 19:26, Sean Christopherson wrote: >>> -void pre_sev_run(struct vcpu_svm *svm, int cpu) >>> +int pre_sev_run(struct vcpu_svm *svm, int cpu) >>> { >>> struct svm_cpu_data *sd = per_cpu_ptr(&svm_data, cpu); >>> - unsigned int asid = sev_get_asid(svm->vcpu.kvm); >>> + struct kvm *kvm = svm->vcpu.kvm; >>> + unsigned int asid = sev_get_asid(kvm); >>> + >>> + /* >>> + * Terminate the VM if userspace attempts to run the vCPU with an >>> + * invalid VMSA, e.g. if userspace forces the vCPU to be RUNNABLE after >>> + * an SNP AP Destroy event. >>> + */ >>> + if (sev_es_guest(kvm) && !VALID_PAGE(svm->vmcb->control.vmsa_pa)) { >>> + kvm_vm_dead(kvm); >>> + return -EIO; >>> + } >> >> If a VMRUN is performed with the vmsa_pa value set to INVALID_PAGE, the >> VMRUN will fail and KVM will dump the VMCB and exit back to userspace > > I haven't tested, but based on what the APM says, I'm pretty sure this would crash > the host due to a #GP on VMRUN, i.e. due to the resulting kvm_spurious_fault(). > > IF (rAX contains an unsupported physical address) > EXCEPTION [#GP] Well that's for the VMCB, the VMSA is pointed to by the VMCB and results in a VMEXIT code of -1 if you don't supply a proper page-aligned, physical address. > >> with KVM_EXIT_INTERNAL_ERROR. >> >> Is doing this preferrable to that? > > Even if AMD guaranteed that the absolute worst case scenario is a failed VMRUN > with zero side effects, doing VMRUN with a bad address should be treated as a > KVM bug. Fair. > >> If so, should a vcpu_unimpl() message be issued, too, to better identify the >> reason for marking the VM dead? > > My vote is no. At some point we need to assume userspace possesess a reasonable > level of competency and sanity. > >>> static void svm_inject_nmi(struct kvm_vcpu *vcpu) >>> @@ -4231,7 +4233,8 @@ static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu, >>> if (force_immediate_exit) >>> smp_send_reschedule(vcpu->cpu); >>> >>> - pre_svm_run(vcpu); >>> + if (pre_svm_run(vcpu)) >>> + return EXIT_FASTPATH_EXIT_USERSPACE; In testing this out, I think userspace continues on because I eventually get: KVM_GET_PIT2 failed: Input/output error /tmp/cmdline.98112: line 1: 98163 Aborted (core dumped) ... Haven't looked too close, but maybe an exit_reason needs to be set to get qemu to quit sooner? Thanks, Tom >> >> Since the return code from pre_svm_run() is never used, should it just >> be a bool function, then? > > Hard no. I strongly dislike boolean returns for functions that aren't obviously > predicates, because it's impossible to determine the polarity of the return value > based solely on the prototype. This leads to bugs that are easier to detect with > 0/-errno return, e.g. returning -EINVAL in a happy path stands out more than > returning the wrong false/true value. > > Case in point (because I just responded to another emain about this function), > what's the polarity of this helper? :-) > > static bool sanity_check_entries(struct kvm_cpuid_entry2 __user *entries, > __u32 num_entries, unsigned int ioctl_type)