The patch titled
     Subject: huegtlbfs-fix-races-and-page-leaks-during-migration-update
has been added to the -mm tree.  Its filename is
     huegtlbfs-fix-races-and-page-leaks-during-migration-update.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/huegtlbfs-fix-races-and-page-leaks-during-migration-update.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/huegtlbfs-fix-races-and-page-leaks-during-migration-update.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Mike Kravetz <mike.kravetz@xxxxxxxxxx>
Subject: huegtlbfs-fix-races-and-page-leaks-during-migration-update

>> 	spin_unlock(ptl);
>> +
>> +	/* May already be set if not newly allocated page */
>> +	set_page_huge_active(page);
>> +

This is wrong.  We need to call set_page_huge_active() only for newly
allocated pages.  Why?  We could have got the page from the pagecache,
and it could be that the page is !page_huge_active() because it has been
isolated for migration.  Therefore, we do not want to set it active here.

I have also found another race with migration when removing a page from a
file.  When a huge page is removed from the pagecache, the page_mapping()
field is cleared, yet page_private continues to point to the subpool until
the page is actually freed by free_huge_page().  free_huge_page() is what
adjusts the counts for the subpool.  A page could be migrated while in
this state.  However, since page_mapping() is not set, the hugetlbfs
specific routine to transfer page_private is not called and we leak the
page count in the filesystem.

To fix, check for this condition before migrating a huge page.  If the
condition is detected, return -EBUSY for the page.
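For anyone who wants a standalone picture of that second race, here is a
minimal userspace sketch (plain C; the names sketch_page and
sketch_can_migrate are hypothetical, not kernel code) of the window the
new check closes: a page whose subpool pointer (page_private) is still
set while its mapping is already gone must not be migrated.

#include <stdio.h>

/* Hypothetical stand-in for struct page during the freeing window. */
struct sketch_page {
	void *mapping;	/* cleared when the page leaves the pagecache  */
	void *subpool;	/* models page_private; cleared only when the
			 * page is finally freed by free_huge_page()   */
};

/* Models the new check at the top of unmap_and_move_huge_page(). */
static int sketch_can_migrate(const struct sketch_page *p)
{
	/*
	 * With no mapping, the hugetlbfs move routine that would
	 * transfer ->subpool to the new page never runs, so migrating
	 * now would leak the subpool count.  Refuse (the kernel
	 * returns -EBUSY here so migration can be retried later).
	 */
	if (p->subpool && !p->mapping)
		return 0;
	return 1;
}

int main(void)
{
	struct sketch_page freeing = { .mapping = NULL,    .subpool = &freeing };
	struct sketch_page mapped  = { .mapping = &mapped, .subpool = &mapped  };

	printf("freeing page migratable? %d\n", sketch_can_migrate(&freeing));
	printf("mapped page migratable?  %d\n", sketch_can_migrate(&mapped));
	return 0;
}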
Both issues are addressed in the updated patch below.

Sorry for the churn.  As I find and fix one issue I seem to discover
another.  There is still at least one more issue with private pages when
COW comes into play.  I continue to work on that.  I wanted to send this
patch earlier as it is pretty easy to hit the bugs if you try.  If you
would prefer another approach, let me know.

Link: http://lkml.kernel.org/r/7534d322-d782-8ac6-1c8d-a8dc380eb3ab@xxxxxxxxxx
Signed-off-by: Mike Kravetz <mike.kravetz@xxxxxxxxxx>
Cc: Michal Hocko <mhocko@xxxxxxxxxx>
Cc: Naoya Horiguchi <n-horiguchi@xxxxxxxxxxxxx>
Cc: Andrea Arcangeli <aarcange@xxxxxxxxxx>
Cc: "Kirill A . Shutemov" <kirill.shutemov@xxxxxxxxxxxxxxx>
Cc: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
Cc: Davidlohr Bueso <dave@xxxxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

--- a/mm/hugetlb.c~huegtlbfs-fix-races-and-page-leaks-during-migration-update
+++ a/mm/hugetlb.c
@@ -3729,6 +3729,7 @@ static vm_fault_t hugetlb_no_page(struct
 	pte_t new_pte;
 	spinlock_t *ptl;
 	unsigned long haddr = address & huge_page_mask(h);
+	bool new_page = false;
 
 	/*
 	 * Currently, we are forced to kill the process in the event the
@@ -3790,6 +3791,7 @@ retry:
 		}
 		clear_huge_page(page, address, pages_per_huge_page(h));
 		__SetPageUptodate(page);
+		new_page = true;
 
 		if (vma->vm_flags & VM_MAYSHARE) {
 			int err = huge_add_to_page_cache(page, mapping, idx);
@@ -3861,8 +3863,9 @@ retry:
 
 	spin_unlock(ptl);
 
-	/* May already be set if not newly allocated page */
-	set_page_huge_active(page);
+	/* Make newly allocated pages active */
+	if (new_page)
+		set_page_huge_active(page);
 
 	unlock_page(page);
 out:
--- a/mm/migrate.c~huegtlbfs-fix-races-and-page-leaks-during-migration-update
+++ a/mm/migrate.c
@@ -1315,6 +1315,16 @@ static int unmap_and_move_huge_page(new_
 		lock_page(hpage);
 	}
 
+	/*
+	 * Check for pages which are in the process of being freed.  Without
+	 * page_mapping() set, hugetlbfs specific move page routine will not
+	 * be called and we could leak usage counts for subpools.
+	 */
+	if (page_private(hpage) && !page_mapping(hpage)) {
+		rc = -EBUSY;
+		goto out_unlock;
+	}
+
 	if (PageAnon(hpage))
 		anon_vma = page_get_anon_vma(hpage);
 
@@ -1345,6 +1355,7 @@ put_anon:
 		put_new_page = NULL;
 	}
 
+out_unlock:
 	unlock_page(hpage);
 out:
 	if (rc != -EAGAIN)
_

Patches currently in -mm which might be from mike.kravetz@xxxxxxxxxx are

huegtlbfs-fix-races-and-page-leaks-during-migration.patch
huegtlbfs-fix-races-and-page-leaks-during-migration-update.patch