On 8/2/20 6:55 PM, Eric van Tassell wrote:
> On 7/31/20 3:40 PM, Sean Christopherson wrote:
>> On Fri, Jul 24, 2020 at 06:54:47PM -0500, eric van tassell wrote:
>>> Add 2 small infrastructure functions here to enable pinning the SEV
>>> guest pages used for sev_launch_update_data() using sev_get_page().
>>>
>>> Pin the memory for the data being passed to launch_update_data()
>>> because it gets encrypted before the guest is first run and must not
>>> be moved, which would corrupt it.
>>>
>>> Signed-off-by: eric van tassell <Eric.VanTassell@xxxxxxx>
>>> ---
>>>  arch/x86/kvm/svm/sev.c | 48 ++++++++++++++++++++++++++++++++++++++++++
>>>  1 file changed, 48 insertions(+)
>>>
>>> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
>>> index 040ae4aa7c5a..e0eed9a20a51 100644
>>> --- a/arch/x86/kvm/svm/sev.c
>>> +++ b/arch/x86/kvm/svm/sev.c
>>> @@ -453,6 +453,37 @@ static int sev_get_page(struct kvm *kvm, gfn_t gfn, kvm_pfn_t pfn)
>>>          return 0;
>>>  }
>>>
>>> +static struct kvm_memory_slot *hva_to_memslot(struct kvm *kvm,
>>> +                                              unsigned long hva)
>>> +{
>>> +        struct kvm_memslots *slots = kvm_memslots(kvm);
>>> +        struct kvm_memory_slot *memslot;
>>> +
>>> +        kvm_for_each_memslot(memslot, slots) {
>>> +                if (hva >= memslot->userspace_addr &&
>>> +                    hva < memslot->userspace_addr +
>>> +                    (memslot->npages << PAGE_SHIFT))
>>> +                        return memslot;
>>> +        }
>>> +
>>> +        return NULL;
>>> +}
>>> +
>>> +static bool hva_to_gfn(struct kvm *kvm, unsigned long hva, gfn_t *gfn)
>>> +{
>>> +        struct kvm_memory_slot *memslot;
>>> +        gpa_t gpa_offset;
>>> +
>>> +        memslot = hva_to_memslot(kvm, hva);
>>> +        if (!memslot)
>>> +                return false;
>>> +
>>> +        gpa_offset = hva - memslot->userspace_addr;
>>> +        *gfn = ((memslot->base_gfn << PAGE_SHIFT) + gpa_offset) >> PAGE_SHIFT;
>>> +
>>> +        return true;
>>> +}
>>> +
>>>  static int sev_launch_update_data(struct kvm *kvm, struct kvm_sev_cmd *argp)
>>>  {
>>>          unsigned long vaddr, vaddr_end, next_vaddr, npages, pages, size, i;
>>> @@ -483,6 +514,23 @@ static int sev_launch_update_data(struct kvm *kvm, struct kvm_sev_cmd *argp)
>>>                  goto e_free;
>>>          }
>>>
>>> +        /*
>>> +         * Increment the page ref count so that the pages do not get migrated or
>>> +         * moved after we are done from the LAUNCH_UPDATE_DATA.
>>> +         */
>>> +        for (i = 0; i < npages; i++) {
>>> +                gfn_t gfn;
>>> +
>>> +                if (!hva_to_gfn(kvm, (vaddr + (i * PAGE_SIZE)) & PAGE_MASK, &gfn)) {
>>
>> This needs to hold kvm->srcu to block changes to memslots while looking up
>> the hva->gpa translation.
>
> I'll look into this.
Fixed.
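Roughly along these lines, i.e. take kvm->srcu for read around the whole
translation/pin loop so the memslots can't change under hva_to_gfn().
Sketch only, not the exact diff (srcu_idx is a new local used here for
illustration):

        int srcu_idx;

        /* Block memslot updates while translating hva -> gfn and pinning. */
        srcu_idx = srcu_read_lock(&kvm->srcu);
        for (i = 0; i < npages; i++) {
                gfn_t gfn;

                if (!hva_to_gfn(kvm, (vaddr + (i * PAGE_SIZE)) & PAGE_MASK, &gfn)) {
                        srcu_read_unlock(&kvm->srcu, srcu_idx);
                        ret = -EFAULT;
                        goto e_unpin;
                }

                ret = sev_get_page(kvm, gfn, page_to_pfn(inpages[i]));
                if (ret) {
                        srcu_read_unlock(&kvm->srcu, srcu_idx);
                        goto e_unpin;
                }
        }
        srcu_read_unlock(&kvm->srcu, srcu_idx);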
>>> +                        ret = -EFAULT;
>>> +                        goto e_unpin;
>>> +                }
>>> +
>>> +                ret = sev_get_page(kvm, gfn, page_to_pfn(inpages[i]));
>>
>> Rather than dump everything into an xarray, KVM can instead pin the pages
>> by faulting them into its MMU.  By pinning pages in the MMU proper, KVM can
>> use software available bits in the SPTEs to track which pages are pinned,
>> can assert/WARN on unexpected behavior with respect to pinned pages, and
>> can drop/unpin pages as soon as they are no longer reachable from KVM, e.g.
>> when the mm_struct dies or the associated memslot is removed.
>>
>> Leveraging the MMU would also make this extensible to non-SEV features,
>> e.g. it can be shared by VMX if VMX adds a feature that needs similar hooks
>> in the MMU.  Shoving the tracking in SEV means the core logic would need to
>> be duplicated for other features.
>>
>> The big caveat is that funneling this through the MMU requires a vCPU[*],
>> i.e. is only viable if userspace has already created at least one vCPU.
>> For QEMU, this is guaranteed.  I don't know about other VMMs.
>>
>> If there are VMMs that support SEV and don't create vCPUs before encrypting
>> guest memory, one option would be to automatically go down the optimized
>> route iff at least one vCPU has been created.  In other words, don't break
>> old VMMs, but don't carry more hacks to make them faster either.
>>
>> It just so happens that I have some code that sort of implements the above.
>> I reworked it to mesh with SEV and will post it as an RFC.  It's far from a
>> tested-and-ready-to-roll implementation, but I think it's fleshed out enough
>> to start a conversation.
>>
>> [*] This isn't a hard requirement, i.e. KVM could be reworked to provide a
>> common MMU for non-nested TDP, but that's a much bigger effort.
>
> I will think about this.  (I'm out of the office Monday and Tuesday.)
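If I follow the proposal correctly, the idea is roughly to have the fault
path mark pinned pages with a software-available SPTE bit, take an extra
page reference while the bit is set, and drop it when the SPTE is zapped
(memslot removal, VM teardown, etc.).  Purely illustrative sketch; the bit
choice and helper names below are made up, not existing KVM symbols:

        /* Hypothetical software-available SPTE bit for "pinned for SEV". */
        #define SPTE_SEV_PINNED         BIT_ULL(62)

        static void sev_pin_spte(u64 *sptep, kvm_pfn_t pfn)
        {
                if (!(*sptep & SPTE_SEV_PINNED)) {
                        get_page(pfn_to_page(pfn));     /* extra ref while pinned */
                        *sptep |= SPTE_SEV_PINNED;
                }
        }

        /* Zap path: release the extra ref for any SPTE carrying the marker. */
        static void sev_unpin_spte(u64 *sptep, kvm_pfn_t pfn)
        {
                if (*sptep & SPTE_SEV_PINNED) {
                        *sptep &= ~SPTE_SEV_PINNED;
                        put_page(pfn_to_page(pfn));
                }
        }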
Brijesh invested time in this approach and could not get it to meet SEV's needs, as he reported in a previous email.
>>> +                if (ret)
>>> +                        goto e_unpin;
>>> +        }
>>> +
>>>          /*
>>>           * The LAUNCH_UPDATE command will perform in-place encryption of the
>>>           * memory content (i.e it will write the same memory region with C=1).
>>> -- 
>>> 2.17.1