On Tue, Jun 02, 2020 at 03:36:39PM +0530, Bharata B Rao wrote:
> On Mon, Jun 01, 2020 at 12:05:35PM -0700, Ram Pai wrote:
> > On Mon, Jun 01, 2020 at 05:25:18PM +0530, Bharata B Rao wrote:
> > > On Sat, May 30, 2020 at 07:27:50PM -0700, Ram Pai wrote:
> > > > H_SVM_INIT_DONE incorrectly assumes that the Ultravisor has explicitly
> > > > called H_SVM_PAGE_IN for all secure pages.
> > >
> > > I don't think that is quite true. HV doesn't assume anything about
> > > secure pages by itself.
> >
> > Yes. Currently, it does not assume anything about secure pages. But I am
> > proposing that it should consider all pages (except the shared pages) as
> > secure pages, when H_SVM_INIT_DONE is called.
>
> Ok, then maybe also add the proposed changes to the H_SVM_INIT_DONE
> documentation.

Ok.

> > In other words, HV should treat all pages, except shared pages, as
> > secure pages once H_SVM_INIT_DONE is called. And this includes pages
> > added subsequently through memory hotplug.
>
> So after H_SVM_INIT_DONE, if HV touches a secure page for any
> reason and gets encrypted contents via page-out, HV drops the
> device pfn at that time. So what state would we be in, in that case? We
> have completed H_SVM_INIT_DONE, but still have a normal (but encrypted)
> page in HV?

Good point. The corresponding GFN will continue to be a secure GFN; just
that its backing PFN is not a device PFN but a memory PFN, and that
backing memory PFN contains encrypted content.

I will clarify this secure-GFN state in the patch; see also the rough
state sketch further down in this mail.

> > Yes, the Ultravisor can explicitly request the HV to move the pages
> > individually. But that will slow down the transition significantly.
> > It takes above 20min to transition them, for an SVM of size 100G.
> >
> > With this proposed enhancement, the switch completes in a few seconds.
>
> I think many pages during the initial switch and most pages for hotplugged
> memory are zero pages, for which we don't anyway issue UV page-in calls.
> So the 20min saving you are observing is purely due to hcall overhead?

That appears to be the case.

> How about extending the H_SVM_PAGE_IN interface or a new hcall to request
> multiple pages in one request?
>
> Also, how about requesting bigger page sizes (2M)? Ralph Campbell
> had patches that added THP support for migrate_vma_* calls.

Yes, that should give a further boost. I think the API does not stop us
from using that feature; what is missing is the support on the
Ultravisor side. Hopefully we will see contributions to the Ultravisor
once it is open-sourced.

> > > > These GFNs continue to be
> > > > normal GFNs associated with normal PFNs; when in fact, these GFNs should
> > > > have been secure GFNs associated with device PFNs.
> > >
> > > Transition to secure state is driven by SVM/UV and HV just responds to
> > > hcalls by issuing appropriate uvcalls. SVM/UV is in the best position to
> > > determine the required pages that need to be moved into the secure side.
> > > HV just responds to it and tracks such pages as device private pages.
> > >
> > > If SVM/UV doesn't get all the pages into the secure side by the time
> > > of H_SVM_INIT_DONE, the remaining pages are just normal (shared or
> > > otherwise) pages as far as HV is concerned. Why should HV assume that
> > > SVM/UV didn't ask for a few pages and hence push those pages during
> > > H_SVM_INIT_DONE?
> >
> > By definition, an SVM is a VM backed by secure pages.
> > Hence all pages (except shared) must turn secure when a VM switches to SVM.
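
To make this concrete, here is a rough sketch of the per-GFN states I
have in mind. The names and encoding below are purely illustrative (my
own, not from the posted patch); it also captures the paged-out case
you raised above, where a secure GFN ends up backed by a memory PFN
holding encrypted contents:

/* Illustrative sketch only -- names are not from the posted patch. */
enum kvmppc_gfn_state {
	KVMPPC_GFN_NORMAL,	  /* not yet paged in or pushed secure */
	KVMPPC_GFN_SHARED,	  /* explicitly shared between HV and SVM */
	KVMPPC_GFN_SECURE_DEVICE, /* secure, backed by a device PFN */
	KVMPPC_GFN_SECURE_MEM,	  /* secure, but paged out: backed by a
				   * memory PFN with encrypted contents */
};

/*
 * Proposed invariant: after H_SVM_INIT_DONE, no GFN of the SVM is
 * KVMPPC_GFN_NORMAL; each is either SHARED or one of the two SECURE
 * states. An HV page-out moves SECURE_DEVICE to SECURE_MEM, never
 * back to NORMAL.
 */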
> >
> > UV is interested in only certain pages of the VM, which it will
> > request explicitly through H_SVM_PAGE_IN. All other pages need not
> > be paged in through UV_PAGE_IN; they just need to be switched to
> > device pages.
> >
> > > I think UV should drive the movement of pages into the secure side,
> > > both of boot-time SVM memory and of hot-plugged memory. HV does the
> > > memslot registration uvcall when new memory is plugged in; UV should
> > > explicitly get the required pages in at that time instead of expecting
> > > HV to drive the same.
> > >
> > > > +static int uv_migrate_mem_slot(struct kvm *kvm,
> > > > +		const struct kvm_memory_slot *memslot)
> > > > +{
> > > > +	unsigned long gfn = memslot->base_gfn;
> > > > +	unsigned long end;
> > > > +	bool downgrade = false;
> > > > +	struct vm_area_struct *vma;
> > > > +	int i, ret = 0;
> > > > +	unsigned long start = gfn_to_hva(kvm, gfn);
> > > > +
> > > > +	if (kvm_is_error_hva(start))
> > > > +		return H_STATE;
> > > > +
> > > > +	end = start + (memslot->npages << PAGE_SHIFT);
> > > > +
> > > > +	down_write(&kvm->mm->mmap_sem);
> > > > +
> > > > +	mutex_lock(&kvm->arch.uvmem_lock);
> > > > +	vma = find_vma_intersection(kvm->mm, start, end);
> > > > +	if (!vma || vma->vm_start > start || vma->vm_end < end) {
> > > > +		ret = H_STATE;
> > > > +		goto out_unlock;
> > > > +	}
> > > > +
> > > > +	ret = ksm_madvise(vma, vma->vm_start, vma->vm_end,
> > > > +			MADV_UNMERGEABLE, &vma->vm_flags);
> > > > +	downgrade_write(&kvm->mm->mmap_sem);
> > > > +	downgrade = true;
> > > > +	if (ret) {
> > > > +		ret = H_STATE;
> > > > +		goto out_unlock;
> > > > +	}
> > > > +
> > > > +	for (i = 0; i < memslot->npages; i++, ++gfn) {
> > > > +		/* skip paged-in pages and shared pages */
> > > > +		if (kvmppc_gfn_is_uvmem_pfn(gfn, kvm, NULL) ||
> > > > +			kvmppc_gfn_is_uvmem_shared(gfn, kvm))
> > > > +			continue;
> > > > +
> > > > +		start = gfn_to_hva(kvm, gfn);
> > > > +		end = start + (1UL << PAGE_SHIFT);
> > > > +		ret = kvmppc_svm_migrate_page(vma, start, end,
> > > > +			(gfn << PAGE_SHIFT), kvm, PAGE_SHIFT, false);
> > > > +
> > > > +		if (ret)
> > > > +			goto out_unlock;
> > > > +	}
> > >
> > > Is there a guarantee that the vma you got for the start address remains
> > > valid for all the addresses till end in a memslot? If not, you should
> > > re-get the vma for the current address in each iteration, I suppose.
> >
> > mm->mmap_sem is the semaphore that guards the vma, right? If that
> > semaphore is held, can the vma change?
>
> I am not sure if the vma you obtained would span the entire address range
> in the memslot.

Will check, and if a single vma is not guaranteed to cover the whole
range, re-get the vma inside the loop.
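
Something like the following untested sketch, which reuses
find_vma_intersection (the same helper already used before the loop) to
re-look-up the vma for each gfn:

	for (i = 0; i < memslot->npages; i++, ++gfn) {
		/* skip paged-in pages and shared pages */
		if (kvmppc_gfn_is_uvmem_pfn(gfn, kvm, NULL) ||
		    kvmppc_gfn_is_uvmem_shared(gfn, kvm))
			continue;

		start = gfn_to_hva(kvm, gfn);
		end = start + (1UL << PAGE_SHIFT);

		/*
		 * Re-get the vma for the current address: a single vma
		 * need not span the entire memslot.
		 */
		vma = find_vma_intersection(kvm->mm, start, end);
		if (!vma || vma->vm_start > start || vma->vm_end < end) {
			ret = H_STATE;
			goto out_unlock;
		}

		ret = kvmppc_svm_migrate_page(vma, start, end,
				(gfn << PAGE_SHIFT), kvm, PAGE_SHIFT, false);
		if (ret)
			goto out_unlock;
	}

Thanks,
RP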