On Thu, Jul 7, 2022 at 2:31 PM Kalra, Ashish <Ashish.Kalra@xxxxxxx> wrote: > > [AMD Official Use Only - General] > > Hello Peter, > > >> >There is a perf cost to this suggestion but it might make accessing > >> >the GHCB safer for KVM. Have you thought about just using > >> >kvm_read_guest() or copy_from_user() to fully copy out the GCHB into a KVM owned buffer, then copying it back before the VMRUN. That way the KVM doesn't need to guard against page_state_changes on the GHCBs, that could be a perf ?>>improvement in a follow up. > >> > >> Along with the performance costs you mentioned, the main concern here > >> will be the GHCB write-back path (copying it back) before VMRUN: this > >> will again hit the issue we have currently with > >> kvm_write_guest() / copy_to_user(), when we use it to sync the scratch > >> buffer back to GHCB. This can fail if guest RAM is mapped using huge-page(s) and RMP is 4K. Please refer to the patch/fix mentioned below, kvm_write_guest() potentially can fail before VMRUN in case of SNP : > >> > >> commit 94ed878c2669532ebae8eb9b4503f19aa33cd7aa > >> Author: Ashish Kalra <ashish.kalra@xxxxxxx> > >> Date: Mon Jun 6 22:28:01 2022 +0000 > >> > >> KVM: SVM: Sync the GHCB scratch buffer using already mapped ghcb > >> > >> Using kvm_write_guest() to sync the GHCB scratch buffer can fail > >> due to host mapping being 2M, but RMP being 4K. The page fault handling > >> in do_user_addr_fault() fails to split the 2M page to handle RMP fault due > >> to it being called here in a non-preemptible context. Instead use > >> the already kernel mapped ghcb to sync the scratch buffer when the > >> scratch buffer is contained within the GHCB. > > >Ah I didn't see that issue thanks for the pointer. > > >The patch description says "When SEV-SNP is enabled the mapped GPA needs to be protected against a page state change." since if the guest were to convert the GHCB page to private when the host is using the GHCB the host could get an RMP violation right? > > Right. > > >That RMP violation would cause the host to crash unless we use some copy_to_user() type protections. > > As such copy_to_user() will only swallow the RMP violation and return failure, so the host can retry the write. > > > I don't see anything mechanism for this patch to add the page state change protection discussed. Can't another vCPU still convert the GHCB to private? > > We do have the protections for GHCB getting mapped to private specifically, there are new post_{map|unmap}_gfn functions added to verify if it is safe to map > GHCB pages. There is a PSC spinlock added which protects again page state change for these mapped pages. > Below is the reference to this patch: > https://lore.kernel.org/lkml/cover.1655761627.git.ashish.kalra@xxxxxxx/T/#mafcaac7296eb9a92c0ea58730dbd3ca47a8e0756 > > But do note that there is protection only for GHCB pages and there is a need to add generic post_{map,unmap}_gfn() ops that can be used to verify > that it's safe to map a given guest page in the hypervisor. This is a TODO right now and probably this is something which UPM can address more cleanly. Thank you Ashish. I had missed that. Can you help me understand why its OK to use kvm_write_guest() for the |snp_certs_data| inside of snp_handle_ext_guest_request() in patch 42/49? I would have thought we'd have the same 2M vs 4K mapping issues. > > >I was wrong about the importance of this though seanjc@ walked me through how UPM will solve this issue so no worries about this until the series is rebased on to UPM. > > Thanks, > Ashish