On 10/15/22 09:25, Miaohe Lin wrote: > Sorry for late respond. It's a really busy week. :) > > On 2022/10/5 9:17, Mike Kravetz wrote: > > The hugetlb vma lock hangs off the vm_private_data field and is specific > > to the vma. When vm_area_dup() is called as part of vma splitting, the > > Oh, I checked vm_area_dup() from callsite of copy_vma and dup_mmap but split_vma > is missed... And yes, vma splitting can occur but vma merging won't for hugetlb > vma. Thanks for catching this, Mike. > > > vma lock pointer is copied to the new vma. This will result in issues > > such as double freeing of the structure. Update the hugetlb open vm_ops > > to allocate a new vma lock for the new vma. > > > > The routine __unmap_hugepage_range_final unconditionally unset > > VM_MAYSHARE to prevent subsequent pmd sharing. hugetlb_vma_lock_free > > attempted to anticipate this by checking both VM_MAYSHARE and VM_SHARED. > > However, if only VM_MAYSHARE was set we would miss the free. With the > > introduction of the vma lock, a vma can not participate in pmd sharing > > if vm_private_data is NULL. Instead of clearing VM_MAYSHARE in > > __unmap_hugepage_range_final, free the vma lock to prevent sharing. Also, > > update the sharing code to make sure vma lock is indeed a condition for > > pmd sharing. hugetlb_vma_lock_free can then key off VM_MAYSHARE and not > > miss any vmas. > > > > Fixes: "hugetlb: add vma based lock for pmd sharing" > > Signed-off-by: Mike Kravetz <mike.kravetz@xxxxxxxxxx> > > --- > > mm/hugetlb.c | 43 +++++++++++++++++++++++++++---------------- > > mm/memory.c | 4 ---- > > 2 files changed, 27 insertions(+), 20 deletions(-) > > > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c > > index 4443e87e814b..0129d371800c 100644 > > --- a/mm/hugetlb.c > > +++ b/mm/hugetlb.c > > @@ -4612,7 +4612,14 @@ static void hugetlb_vm_op_open(struct vm_area_struct *vma) > > kref_get(&resv->refs); > > } > > > > - hugetlb_vma_lock_alloc(vma); > > + /* > > + * vma_lock structure for sharable mappings is vma specific. > > + * Clear old pointer (if copied via vm_area_dup) and create new. > > + */ > > + if (vma->vm_flags & VM_MAYSHARE) { > > + vma->vm_private_data = NULL; > > + hugetlb_vma_lock_alloc(vma); > > + } > > IMHO this would lead to memoryleak. Think about the below move_vma() flow: > move_vma > copy_vma > new_vma = vm_area_dup(vma); > new_vma->vm_ops->open(new_vma); --> new_vma has its own vma lock. > is_vm_hugetlb_page(vma) > clear_vma_resv_huge_pages > hugetlb_dup_vma_private --> vma->vm_private_data is set to NULL > without put ref. So vma lock is *leaked*? You are right, that could lead to a leak. I have an idea about setting vma->vm_private_data to NULL for VM_MAYSHARE vmas in routines like hugetlb_dup_vma_private(). We can check hugetlb_vma_lock->vma and only set to NULL if, vma->(hugetlb_vma_lock)vma->vm_private_data->vma != vma Got sidetracked chasing down another leak today. Will send a patch implementing this idea soon. Thanks for looking at this! -- Mike Kravetz