The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable@xxxxxxxxxxxxxxx>.

Possible dependencies:

73bdf65ea748 ("migrate: hugetlb: check for hugetlb shared PMD in node migration")
7ce82f4c3f3e ("mm/migration: return errno when isolate_huge_page failed")
1b7f7e58decc ("mm/gup: Convert check_and_migrate_movable_pages() to use a folio")
f9f38f78c5d5 ("mm: refactor check_and_migrate_movable_pages")
5ac95884a784 ("mm/migrate: enable returning precise migrate_pages() success count")
c5b5a3dd2c1f ("mm: thp: refactor NUMA fault handling")
5db4f15c4fd7 ("mm: memory: add orig_pmd to struct vm_fault")
8f34f1eac382 ("mm/userfaultfd: fix uffd-wp special cases for fork()")
25182f05ffed ("mm,hwpoison: fix race with hugetlb page allocation")
f68749ec342b ("mm/gup: longterm pin migration cleanup")
d1e153fea2a8 ("mm/gup: migrate pinned pages out of movable zone")
1a08ae36cf8b ("mm cma: rename PF_MEMALLOC_NOCMA to PF_MEMALLOC_PIN")
6e7f34ebb8d2 ("mm/gup: check for isolation errors")
f0f4463837da ("mm/gup: return an error on migration failure")
83c02c23d074 ("mm/gup: check every subpage of a compound page during isolation")
c991ffef7bce ("mm/gup: don't pin migrated cma pages in movable zone")
7ee820ee7238 ("Revert "mm: migrate: skip shared exec THP for NUMA balancing"")
ae37c7ff79f1 ("mm: make alloc_contig_range handle in-use hugetlb pages")
369fa227c219 ("mm: make alloc_contig_range handle free hugetlb pages")
c2ad7a1ffeaf ("mm,compaction: let isolate_migratepages_{range,block} return error codes")

thanks,

greg k-h

------------------ original commit in Linus's tree ------------------

From 73bdf65ea74857d7fb2ec3067a3cec0e261b1462 Mon Sep 17 00:00:00 2001
From: Mike Kravetz <mike.kravetz@xxxxxxxxxx>
Date: Thu, 26 Jan 2023 14:27:21 -0800
Subject: [PATCH] migrate: hugetlb: check for hugetlb shared PMD in node migration

migrate_pages/mempolicy semantics state that CAP_SYS_NICE is required to
move pages shared with another process to a different node.
page_mapcount > 1 is being used to determine if a hugetlb page is shared.
However, a hugetlb page will have a mapcount of 1 if mapped by multiple
processes via a shared PMD.  As a result, hugetlb pages shared by multiple
processes and mapped with a shared PMD can be moved by a process without
CAP_SYS_NICE.

To fix, check for a shared PMD if mapcount is 1.  If a shared PMD is found
consider the page shared.

Link: https://lkml.kernel.org/r/20230126222721.222195-3-mike.kravetz@xxxxxxxxxx
Fixes: e2d8cf405525 ("migrate: add hugepage migration code to migrate_pages()")
Signed-off-by: Mike Kravetz <mike.kravetz@xxxxxxxxxx>
Acked-by: Peter Xu <peterx@xxxxxxxxxx>
Acked-by: David Hildenbrand <david@xxxxxxxxxx>
Cc: James Houghton <jthoughton@xxxxxxxxxx>
Cc: Matthew Wilcox <willy@xxxxxxxxxxxxx>
Cc: Michal Hocko <mhocko@xxxxxxxx>
Cc: Muchun Song <songmuchun@xxxxxxxxxxxxx>
Cc: Naoya Horiguchi <naoya.horiguchi@xxxxxxxxx>
Cc: Vishal Moola (Oracle) <vishal.moola@xxxxxxxxx>
Cc: Yang Shi <shy828301@xxxxxxxxx>
Cc: <stable@xxxxxxxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 02c8a712282f..f940395667c8 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -600,7 +600,8 @@ static int queue_pages_hugetlb(pte_t *pte, unsigned long hmask,
 	/* With MPOL_MF_MOVE, we migrate only unshared hugepage. */
 	if (flags & (MPOL_MF_MOVE_ALL) ||
-	    (flags & MPOL_MF_MOVE && page_mapcount(page) == 1)) {
+	    (flags & MPOL_MF_MOVE && page_mapcount(page) == 1 &&
+	     !hugetlb_pmd_shared(pte))) {
 		if (isolate_hugetlb(page, qp->pagelist) &&
 			(flags & MPOL_MF_STRICT))
 			/*