On Tue, Jul 21, 2020 at 12:42:02PM +0200, Laurent Dufour wrote: > When a secure memslot is dropped, all the pages backed in the secure device > (aka really backed by secure memory by the Ultravisor) should be paged out > to a normal page. Previously, this was achieved by triggering the page > fault mechanism which is calling kvmppc_svm_page_out() on each pages. > > This can't work when hot unplugging a memory slot because the memory slot > is flagged as invalid and gfn_to_pfn() is then not trying to access the > page, so the page fault mechanism is not triggered. > > Since the final goal is to make a call to kvmppc_svm_page_out() it seems > simpler to directly calling it instead of triggering such a mechanism. This > way kvmppc_uvmem_drop_pages() can be called even when hot unplugging a > memslot. > > Since kvmppc_uvmem_drop_pages() is already holding kvm->arch.uvmem_lock, > the call to __kvmppc_svm_page_out() is made. > As __kvmppc_svm_page_out needs the vma pointer to migrate the pages, the > VMA is fetched in a lazy way, to not trigger find_vma() all the time. In > addition, the mmap_sem is help in read mode during that time, not in write > mode since the virual memory layout is not impacted, and > kvm->arch.uvmem_lock prevents concurrent operation on the secure device. > > Cc: Ram Pai <linuxram@xxxxxxxxxx> > Cc: Bharata B Rao <bharata@xxxxxxxxxxxxx> > Cc: Paul Mackerras <paulus@xxxxxxxxxx> > Signed-off-by: Laurent Dufour <ldufour@xxxxxxxxxxxxx> > --- > arch/powerpc/kvm/book3s_hv_uvmem.c | 54 ++++++++++++++++++++---------- > 1 file changed, 37 insertions(+), 17 deletions(-) > > diff --git a/arch/powerpc/kvm/book3s_hv_uvmem.c b/arch/powerpc/kvm/book3s_hv_uvmem.c > index 5a4b02d3f651..ba5c7c77cc3a 100644 > --- a/arch/powerpc/kvm/book3s_hv_uvmem.c > +++ b/arch/powerpc/kvm/book3s_hv_uvmem.c > @@ -624,35 +624,55 @@ static inline int kvmppc_svm_page_out(struct vm_area_struct *vma, > * fault on them, do fault time migration to replace the device PTEs in > * QEMU page table with normal PTEs from newly allocated pages. > */ > -void kvmppc_uvmem_drop_pages(const struct kvm_memory_slot *free, > +void kvmppc_uvmem_drop_pages(const struct kvm_memory_slot *slot, > struct kvm *kvm, bool skip_page_out) > { > int i; > struct kvmppc_uvmem_page_pvt *pvt; > - unsigned long pfn, uvmem_pfn; > - unsigned long gfn = free->base_gfn; > + struct page *uvmem_page; > + struct vm_area_struct *vma = NULL; > + unsigned long uvmem_pfn, gfn; > + unsigned long addr, end; > + > + mmap_read_lock(kvm->mm); > + > + addr = slot->userspace_addr; We typically use gfn_to_hva() for that, but that won't work for a memslot that is already marked INVALID which is the case here. I think it is ok to access slot->userspace_addr here of an INVALID memslot, but just thought of explictly bringing this up. > + end = addr + (slot->npages * PAGE_SIZE); > > - for (i = free->npages; i; --i, ++gfn) { > - struct page *uvmem_page; > + gfn = slot->base_gfn; > + for (i = slot->npages; i; --i, ++gfn, addr += PAGE_SIZE) { > + > + /* Fetch the VMA if addr is not in the latest fetched one */ > + if (!vma || (addr < vma->vm_start || addr >= vma->vm_end)) { > + vma = find_vma_intersection(kvm->mm, addr, end); > + if (!vma || > + vma->vm_start > addr || vma->vm_end < end) { > + pr_err("Can't find VMA for gfn:0x%lx\n", gfn); > + break; > + } > + } In Ram's series, kvmppc_memslot_page_merge() also walks the VMAs spanning the memslot, but it uses a different logic for the same. Why can't these two cases use the same method to walk the VMAs? Is there anything subtly different between the two cases? Regards, Bharata.