On Mon, Aug 21, 2023 at 12:51:20PM -0700, Hugh Dickins wrote: > Jann Horn demonstrated how userfaultfd ioctl UFFDIO_COPY into a private > shmem mapping can add valid PTEs to page table collapse_pte_mapped_thp() > thought it had emptied: page lock on the huge page is enough to protect > against WP faults (which find the PTE has been cleared), but not enough > to protect against userfaultfd. "BUG: Bad rss-counter state" followed. > > retract_page_tables() protects against this by checking !vma->anon_vma; > but we know that MADV_COLLAPSE needs to be able to work on private shmem > mappings, even those with an anon_vma prepared for another part of the > mapping; and we know that MADV_COLLAPSE needs to work on shared shmem > mappings which are userfaultfd_armed(). Whether it needs to work on > private shmem mappings which are userfaultfd_armed(), I'm not so sure: > but assume that it does. > > Just for this case, take the pmd_lock() two steps earlier: not because > it gives any protection against this case itself, but because ptlock > nests inside it, and it's the dropping of ptlock which let the bug in. > In other cases, continue to minimize the pmd_lock() hold time. > > Reported-by: Jann Horn <jannh@xxxxxxxxxx> > Closes: https://lore.kernel.org/linux-mm/CAG48ez0FxiRC4d3VTu_a9h=rg5FW-kYD5Rg5xo_RDBM0LTTqZQ@xxxxxxxxxxxxxx/ > Fixes: 1043173eb5eb ("mm/khugepaged: collapse_pte_mapped_thp() with mmap_read_lock()") > Signed-off-by: Hugh Dickins <hughd@xxxxxxxxxx> The locking is indeed slightly complicated.. but I didn't spot anything wrong. Acked-by: Peter Xu <peterx@xxxxxxxxxx> Thanks, -- Peter Xu