On Fri, Oct 21, 2022 at 3:27 PM Tom Lendacky <thomas.lendacky@xxxxxxx> wrote:
>
> On 10/21/22 15:57, Peter Gonda wrote:
> > On Fri, Oct 21, 2022 at 1:02 PM Tom Lendacky <thomas.lendacky@xxxxxxx> wrote:
> >> On 10/21/22 12:33, Peter Gonda wrote:
> >>> The ASP and an SNP guest use a series of AES-GCM keys called VMPCKs to
> >>> communicate securely with each other. The IV to this scheme is a
> >>> sequence number that both the ASP and the guest track. Currently this
> >>> sequence number in a guest request must exactly match the sequence
> >>> number tracked by the ASP. This means that if the guest sees an error
> >>> from the host during a request it can only retry that exact request or
> >>> disable the VMPCK to prevent an IV reuse. AES-GCM cannot tolerate IV
> >>> reuse, see:
> >>> https://csrc.nist.gov/csrc/media/projects/block-cipher-techniques/documents/bcm/comments/800-38-series-drafts/gcm/joux_comments.pdf
> >>>
> >>> To handle userspace querying the cert_data length, instead of requesting
> >>> the cert length from userspace, use the size of the driver's allocated
> >>> shared buffer. Then copy that buffer to userspace, or give userspace an
> >>> error depending on the size of the buffer given by userspace.
> >>>
> >>> Fixes: fce96cf044308 ("virt: Add SEV-SNP guest driver")
> >>> Signed-off-by: Peter Gonda <pgonda@xxxxxxxxxx>
> >>> Reported-by: Peter Gonda <pgonda@xxxxxxxxxx>
> >>> Reviewed-by: Dionna Glaze <dionnaglaze@xxxxxxxxxx>
> >>> Cc: Borislav Petkov <bp@xxxxxxx>
> >>> Cc: Tom Lendacky <thomas.lendacky@xxxxxxx>
> >>> Cc: Michael Roth <michael.roth@xxxxxxx>
> >>> Cc: Haowen Bai <baihaowen@xxxxxxxxx>
> >>> Cc: Yang Yingliang <yangyingliang@xxxxxxxxxx>
> >>> Cc: Marc Orr <marcorr@xxxxxxxxxx>
> >>> Cc: David Rientjes <rientjes@xxxxxxxxxx>
> >>> Cc: Ashish Kalra <Ashish.Kalra@xxxxxxx>
> >>> Cc: linux-kernel@xxxxxxxxxxxxxxx
> >>> Cc: kvm@xxxxxxxxxxxxxxx
> >>> ---
>
> >>> @@ -477,25 +496,37 @@ static int get_ext_report(struct snp_guest_dev *snp_dev, struct snp_guest_reques
> >>>   	if (!resp)
> >>>   		return -ENOMEM;
> >>>
> >>> -	snp_dev->input.data_npages = npages;
> >>> +	snp_dev->input.data_npages = sizeof(*snp_dev->certs_data) >> PAGE_SHIFT;
> >>>   	ret = handle_guest_request(snp_dev, SVM_VMGEXIT_EXT_GUEST_REQUEST, arg->msg_version,
> >>>   				   SNP_MSG_REPORT_REQ, &req.data,
> >>>   				   sizeof(req.data), resp->data, resp_len, &arg->fw_err);
> >>>
> >>> +	resp_cert_len = snp_dev->input.data_npages << PAGE_SHIFT;
> >>
> >> The hypervisor is not required to update the number of pages that the
> >> certificates actually used/required if enough pages were supplied. So this
> >> value could always remain as 4 (based on SEV_FW_BLOB_MAX_SIZE) on
> >> successful return.
> >>
> >> And if that's the case, we could always just return 4 for the number of
> >> pages no matter what. Otherwise you'll have to update the logic here if
> >> you want to obtain the actual number.
> >
> > Are you asking for this to just hard-code the userspace requirement to
> > 4 pages? We could leave this as written here; that would leave the
> > guest open to a new GHCB spec where
>
> It's up to you. Ideally, if userspace provides an npages value of 0, then
> the driver issues the request with 0 and gets back the actual value. Then,
> to ensure the sequence numbers are updated, you issue the request again
> with either the just-returned value or SEV_FW_BLOB_MAX_SIZE >>
> PAGE_SHIFT. That will update the sequence numbers and the driver returns
> the number of pages required as returned from the first request.
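Just to check my understanding, is the flow you're describing roughly the
sketch below? This isn't meant as the actual patch: probe_cert_npages and
its parameter list are placeholders, the request call is only shaped like
the one in the diff above, and error handling is trimmed.

/*
 * Rough sketch of the two-pass flow: probe with 0 pages to learn the
 * required size, then re-issue with the full shared buffer so the VMPCK
 * sequence numbers still advance. Not the real driver code.
 */
static int probe_cert_npages(struct snp_guest_dev *snp_dev,
			     struct snp_guest_request_ioctl *arg,
			     struct snp_ext_report_req *req,
			     struct snp_report_resp *resp, u32 resp_len,
			     unsigned int *npages_out)
{
	int ret;

	/* Pass 1: 0 pages makes the hypervisor report the size it needs. */
	snp_dev->input.data_npages = 0;
	ret = handle_guest_request(snp_dev, SVM_VMGEXIT_EXT_GUEST_REQUEST,
				   arg->msg_version, SNP_MSG_REPORT_REQ,
				   &req->data, sizeof(req->data),
				   resp->data, resp_len, &arg->fw_err);
	if (arg->fw_err != SNP_GUEST_REQ_INVALID_LEN)
		return ret;

	/* The hypervisor wrote back the number of pages it actually needs. */
	*npages_out = snp_dev->input.data_npages;

	/*
	 * Pass 2: retry with the whole shared buffer so the request succeeds
	 * and the sequence numbers advance; report *npages_out to userspace.
	 */
	snp_dev->input.data_npages = SEV_FW_BLOB_MAX_SIZE >> PAGE_SHIFT;
	return handle_guest_request(snp_dev, SVM_VMGEXIT_EXT_GUEST_REQUEST,
				    arg->msg_version, SNP_MSG_REPORT_REQ,
				    &req->data, sizeof(req->data),
				    resp->data, resp_len, &arg->fw_err);
}
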
>
> That number can also be cached and then whenever userspace calls down with
> npages of 0, immediately return the cached value. If the request ever gets
> a SNP_GUEST_REQ_INVALID_LEN with the cached value, the newly returned
> value gets cached and the driver performs the request again, like the very
> first time, in order to update the sequence numbers. The driver would then
> return the new npages value back to userspace. Of course that becomes the
> "minimum" number of pages now, so even if the hypervisor reduces the size
> of the certs data, it always requires more than needed.
>
> >
> > "State from Hypervisor: on error will contain the number of guest
> > contiguous pages required to hold the data to be returned"
> >
> > Is instead:
> >
> > "State from Hypervisor: contain the number of guest contiguous pages
> > required to hold the data to be returned"
>
> If the driver always returns 4, I don't see this as requiring any change
> to the spec. It may be more than is truly needed, but that doesn't matter;
> the cert data will fit, it just may be more than necessary. This can occur
> even if you pass back the actual number, if, in between the call with 0,
> the hypervisor updates the certs such that fewer pages are required.
>
> >
> > I think this would be a non-breaking change since the spec says
> > nothing of the non-error case currently. Fine with your preference
> > here. Either Dionna or I can follow up with a series to allow more
> > than 4 pages if needed.
>
> I'd prefer that userspace get the actual number, but really, I don't think
> it's a big deal to just return 4 until the driver can handle a more
> dynamic approach should more than 4 pages ever be needed (this would also
> require support on the hypervisor, where currently not more than 4 pages
> are allowed to be provided, too).
>
> I just wanted you to be aware that in testing you're likely to see 4
> always returned to userspace.

So if we want userspace to get the actual number, I think we need the
host to tell us the actual number. Currently userspace only gets an upper
bound, since these requests race against host changes:

0. Host updates cert_data to be 10 pages
1. Guest requests SNP EXT REQ for a length query
2. Host returns 10 pages
3. Host updates cert_data to be 2 pages
4. Guest requests SNP EXT REQ with a 10-page buffer
5. Host returns certs

I have sent a V3 series. I left this patch largely the same but added a
second patch to fix up the extended request retrying.

> >
> > The logic required would be parsing the GUID table? I think we'd
> > rather keep that out of the kernel driver, right?
>
> No, that's not the logic I'm thinking of. I'm just thinking of using the
> userspace-specified npages and acting accordingly.
>
> Thanks,
> Tom
>
> >
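
For reference, a hypothetical sketch of the caching scheme described
above. Again, this is not the actual driver code: cached_cert_npages, the
helper name, and its parameter list are made up, locking is omitted, and
the request call just mirrors the shape of the one in the quoted diff.

/* Assumed driver-private cache of the last size the hypervisor reported. */
static unsigned int cached_cert_npages;

static int get_cert_npages_cached(struct snp_guest_dev *snp_dev,
				  struct snp_guest_request_ioctl *arg,
				  struct snp_ext_report_req *req,
				  struct snp_report_resp *resp, u32 resp_len,
				  unsigned int npages, unsigned int *npages_out)
{
	int ret;

	/* Fast path: a 0-page query just reports the cached minimum. */
	if (!npages && cached_cert_npages) {
		*npages_out = cached_cert_npages;
		return 0;
	}

	snp_dev->input.data_npages = npages;
	ret = handle_guest_request(snp_dev, SVM_VMGEXIT_EXT_GUEST_REQUEST,
				   arg->msg_version, SNP_MSG_REPORT_REQ,
				   &req->data, sizeof(req->data),
				   resp->data, resp_len, &arg->fw_err);
	/* Success or hard error: the caller's buffer was big enough. */
	if (arg->fw_err != SNP_GUEST_REQ_INVALID_LEN)
		return ret;

	/* Remember the new minimum the hypervisor reported ... */
	cached_cert_npages = snp_dev->input.data_npages;
	*npages_out = cached_cert_npages;

	/*
	 * ... then redo the request with that size, like the very first
	 * time, so the VMPCK sequence numbers still advance.
	 */
	snp_dev->input.data_npages = cached_cert_npages;
	return handle_guest_request(snp_dev, SVM_VMGEXIT_EXT_GUEST_REQUEST,
				    arg->msg_version, SNP_MSG_REPORT_REQ,
				    &req->data, sizeof(req->data),
				    resp->data, resp_len, &arg->fw_err);
}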