On 09/21/22 16:34, Liu Shixin wrote:
> The vma_lock and hugetlb_fault_mutex are dropped before handling
> userfault and reacquire them again after handle_userfault(), but
> reacquire the vma_lock could lead to UAF[1] due to the following
> race,
>
> hugetlb_fault
>   hugetlb_no_page
>     /*unlock vma_lock */
>     hugetlb_handle_userfault
>       handle_userfault
>         /* unlock mm->mmap_lock*/
>                                            vm_mmap_pgoff
>                                              do_mmap
>                                                mmap_region
>                                                  munmap_vma_range
>                                                    /* clean old vma */
>         /* lock vma_lock again  <--- UAF */
>     /* unlock vma_lock */
>
> Since the vma_lock will unlock immediately after hugetlb_handle_userfault(),
> let's drop the unneeded lock and unlock in hugetlb_handle_userfault() to fix
> the issue.

Thank you very much!

When I saw this report, the obvious fix was to do something like what
you have done below.  That looks fine with a few minor comments.

One question I have not yet answered is, "Does this same issue apply to
follow_hugetlb_page()?".  I believe it does.  follow_hugetlb_page calls
hugetlb_fault which could result in the fault being processed by
userfaultfd.  If we experience the race above, then the associated vma
could no longer be valid when returning from hugetlb_fault.

follow_hugetlb_page and callers have a flag (locked) to deal with
dropping the mmap lock.  However, I am not sure if it is handled
correctly WRT userfaultfd.  I think this needs to be answered before
fixing.  And, if the follow_hugetlb_page code needs to be fixed, it
should be done at the same time.
> [1] https://lore.kernel.org/linux-mm/20220921014457.1668-1-liuzixian4@xxxxxxxxxx/
> Reported-by: Liu Zixian <liuzixian4@xxxxxxxxxx>

Perhaps reported by should be,

Reported-by: syzbot+193f9cee8638750b23cf@xxxxxxxxxxxxxxxxxxxxxxxxx
https://lore.kernel.org/linux-mm/000000000000d5e00a05e834962e@xxxxxxxxxx/

Should also add,

Fixes: 1a1aad8a9b7b ("userfaultfd: hugetlbfs: add userfaultfd hugetlb hook")

as well as,

Cc: <stable@xxxxxxxxxxxxxxx>

> Signed-off-by: Liu Shixin <liushixin2@xxxxxxxxxx>
> Signed-off-by: Kefeng Wang <wangkefeng.wang@xxxxxxxxxx>
> ---
>  mm/hugetlb.c | 30 +++++++++++-------------------
>  1 file changed, 11 insertions(+), 19 deletions(-)
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 9b8526d27c29..5a5d466692cf 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c

...

> @@ -5792,11 +5786,9 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
>
>  	entry = huge_ptep_get(ptep);
>  	/* PTE markers should be handled the same way as none pte */
> -	if (huge_pte_none_mostly(entry)) {
> -		ret = hugetlb_no_page(mm, vma, mapping, idx, address, ptep,
> +	if (huge_pte_none_mostly(entry))

We should add a big comment noting that hugetlb_no_page will drop the
vma lock and hugetlb fault mutex.  This will make it easier for people
reading the code, who might otherwise immediately think we are returning
without dropping the locks.

> +		return hugetlb_no_page(mm, vma, mapping, idx, address, ptep,
>  				      entry, flags);
> -		goto out_mutex;
> -	}
>
>  	ret = 0;
>
> --
> 2.25.1
>
--
Mike Kravetz