On Thu, 28 Nov 2019, Bharata B Rao wrote: > On Mon, Nov 25, 2019 at 08:36:24AM +0530, Bharata B Rao wrote: > > Hi, > > > > This is the next version of the patchset that adds required support > > in the KVM hypervisor to run secure guests on PEF-enabled POWER platforms. > > > > Here is a fix for the issue Hugh identified with the usage of ksm_madvise() > in this patchset. It applies on top of this patchset. It looks correct to me, and I hope will not spoil your performance in any way that matters. But I have to say, the patch would be so much clearer, if you just named your bool "downgraded" instead of "downgrade". Hugh > ---- > > From 8a4d769bf4c61f921c79ce68923be3c403bd5862 Mon Sep 17 00:00:00 2001 > From: Bharata B Rao <bharata@xxxxxxxxxxxxx> > Date: Thu, 28 Nov 2019 09:31:54 +0530 > Subject: [PATCH 1/1] KVM: PPC: Book3S HV: Take write mmap_sem when calling > ksm_madvise > > In order to prevent the device private pages (that correspond to > pages of secure guest) from participating in KSM merging, H_SVM_PAGE_IN > calls ksm_madvise() under read version of mmap_sem. However ksm_madvise() > needs to be under write lock, fix this. > > Signed-off-by: Bharata B Rao <bharata@xxxxxxxxxxxxx> > --- > arch/powerpc/kvm/book3s_hv_uvmem.c | 29 ++++++++++++++++++++--------- > 1 file changed, 20 insertions(+), 9 deletions(-) > > diff --git a/arch/powerpc/kvm/book3s_hv_uvmem.c b/arch/powerpc/kvm/book3s_hv_uvmem.c > index f24ac3cfb34c..2de264fc3156 100644 > --- a/arch/powerpc/kvm/book3s_hv_uvmem.c > +++ b/arch/powerpc/kvm/book3s_hv_uvmem.c > @@ -46,11 +46,10 @@ > * > * Locking order > * > - * 1. srcu_read_lock(&kvm->srcu) - Protects KVM memslots > - * 2. down_read(&kvm->mm->mmap_sem) - find_vma, migrate_vma_pages and helpers > - * 3. mutex_lock(&kvm->arch.uvmem_lock) - protects read/writes to uvmem slots > - * thus acting as sync-points > - * for page-in/out > + * 1. kvm->srcu - Protects KVM memslots > + * 2. kvm->mm->mmap_sem - find_vma, migrate_vma_pages and helpers, ksm_madvise > + * 3. kvm->arch.uvmem_lock - protects read/writes to uvmem slots thus acting > + * as sync-points for page-in/out > */ > > /* > @@ -344,7 +343,7 @@ static struct page *kvmppc_uvmem_get_page(unsigned long gpa, struct kvm *kvm) > static int > kvmppc_svm_page_in(struct vm_area_struct *vma, unsigned long start, > unsigned long end, unsigned long gpa, struct kvm *kvm, > - unsigned long page_shift) > + unsigned long page_shift, bool *downgrade) > { > unsigned long src_pfn, dst_pfn = 0; > struct migrate_vma mig; > @@ -360,8 +359,15 @@ kvmppc_svm_page_in(struct vm_area_struct *vma, unsigned long start, > mig.src = &src_pfn; > mig.dst = &dst_pfn; > > + /* > + * We come here with mmap_sem write lock held just for > + * ksm_madvise(), otherwise we only need read mmap_sem. > + * Hence downgrade to read lock once ksm_madvise() is done. > + */ > ret = ksm_madvise(vma, vma->vm_start, vma->vm_end, > MADV_UNMERGEABLE, &vma->vm_flags); > + downgrade_write(&kvm->mm->mmap_sem); > + *downgrade = true; > if (ret) > return ret; > > @@ -456,6 +462,7 @@ unsigned long > kvmppc_h_svm_page_in(struct kvm *kvm, unsigned long gpa, > unsigned long flags, unsigned long page_shift) > { > + bool downgrade = false; > unsigned long start, end; > struct vm_area_struct *vma; > int srcu_idx; > @@ -476,7 +483,7 @@ kvmppc_h_svm_page_in(struct kvm *kvm, unsigned long gpa, > > ret = H_PARAMETER; > srcu_idx = srcu_read_lock(&kvm->srcu); > - down_read(&kvm->mm->mmap_sem); > + down_write(&kvm->mm->mmap_sem); > > start = gfn_to_hva(kvm, gfn); > if (kvm_is_error_hva(start)) > @@ -492,12 +499,16 @@ kvmppc_h_svm_page_in(struct kvm *kvm, unsigned long gpa, > if (!vma || vma->vm_start > start || vma->vm_end < end) > goto out_unlock; > > - if (!kvmppc_svm_page_in(vma, start, end, gpa, kvm, page_shift)) > + if (!kvmppc_svm_page_in(vma, start, end, gpa, kvm, page_shift, > + &downgrade)) > ret = H_SUCCESS; > out_unlock: > mutex_unlock(&kvm->arch.uvmem_lock); > out: > - up_read(&kvm->mm->mmap_sem); > + if (downgrade) > + up_read(&kvm->mm->mmap_sem); > + else > + up_write(&kvm->mm->mmap_sem); > srcu_read_unlock(&kvm->srcu, srcu_idx); > return ret; > } > -- > 2.21.0