On Mon, May 20, 2024, Michael Roth wrote: > On Mon, May 20, 2024 at 07:17:13AM -0700, Sean Christopherson wrote: > > On Sat, May 18, 2024, Michael Roth wrote: > > > Before forwarding guest requests to firmware, KVM takes a reference on > > > the 2 pages the guest uses for its request/response buffers. Make sure > > > to release these when cleaning up after the request is completed. > > > > > > Signed-off-by: Michael Roth <michael.roth@xxxxxxx> > > > --- > > > > ... > > > > > @@ -3970,14 +3980,11 @@ static int __snp_handle_guest_req(struct kvm *kvm, gpa_t req_gpa, gpa_t resp_gpa > > > return ret; > > > > > > ret = sev_issue_cmd(kvm, SEV_CMD_SNP_GUEST_REQUEST, &data, fw_err); > > > - if (ret) > > > - return ret; > > > > > > - ret = snp_cleanup_guest_buf(&data); > > > - if (ret) > > > - return ret; > > > + if (snp_cleanup_guest_buf(&data)) > > > + return -EINVAL; > > > > EINVAL feels wrong. The input was completely valid. Also, forwarding the error > > Yah, EIO seems more suitable here. > > > to the guest doesn't seem like the right thing to do if KVM can't reclaim the > > response PFN. Shouldn't that be fatal to the VM? > > The thinking here is that pretty much all guest request failures will be > fatal to the guest being able to continue. At least, that's definitely > true for attestation. So reporting the error to the guest would allow that > failure to be propagated along by handling in the guest where it would > presumably be reported a little more clearly to the guest owner, at > which point the guest would most likely terminate itself anyway. But failure to convert a pfn back to shared is a _host_ issue, not a guest issue. E.g. it most likely indicates a bug in the host software stack, or perhaps a bad CPU or firmware bug. > But there is a possibility that the guest will attempt access the response > PFN before/during that reporting and spin on an #NPF instead though. So > maybe the safer more repeatable approach is to handle the error directly > from KVM and propagate it to userspace. I was thinking more along the lines of KVM marking the VM as dead/bugged. > But the GHCB spec does require that the firmware response code for > SNP_GUEST_REQUEST be passed directly to the guest via lower 32-bits of > SW_EXITINFO2, so we'd still want handling to pass that error on to the > guest, so I made some changes to retain that behavior. If and only the hypervisor completes the event. The hypervisor must save the SNP_GUEST_REQUEST return code in the lower 32-bits of the SW_EXITINFO2 field before completing the Guest Request NAE event. If KVM terminates the VM, there's obviously no need to fill SW_EXITINFO2. Side topic, is there a plan to ratelimit Guest Requests? To avoid the possibility of a guest creating a denial of service attack against the SNP firmware, it is recommended that some form of rate limiting be implemented should it be detected that a high number of Guest Request NAE events are being issued.