The patch titled
     Subject: thp: keep huge zero page pinned until tlb flush
has been added to the -mm tree.  Its filename is
     thp-keep-huge-zero-page-pinned-until-tlb-flush.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/thp-keep-huge-zero-page-pinned-until-tlb-flush.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/thp-keep-huge-zero-page-pinned-until-tlb-flush.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/SubmitChecklist when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: "Kirill A. Shutemov" <kirill.shutemov@xxxxxxxxxxxxxxx>
Subject: thp: keep huge zero page pinned until tlb flush

Andrea has found[1] a race condition between MMU-gather based TLB flush
and split_huge_page() or the shrinker freeing the huge zero page under
us (patch 1/2 and 2/2 respectively).

With the new THP refcounting we don't need patch 1/2: mmu_gather keeps
the page pinned until the flush is complete, and the pin prevents the
page from being split under us.

We still need patch 2/2.  This is a simplified version of Andrea's
patch.  We don't need the fancy encoding.

[1] http://lkml.kernel.org/r/1447938052-22165-1-git-send-email-aarcange@xxxxxxxxxx

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>
Reported-by: Andrea Arcangeli <aarcange@xxxxxxxxxx>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@xxxxxxxxxxxxxxxxxx>
Cc: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
Cc: Hugh Dickins <hughd@xxxxxxxxxx>
Cc: Johannes Weiner <hannes@xxxxxxxxxxx>
Cc: Dave Hansen <dave.hansen@xxxxxxxxx>
Cc: Vlastimil Babka <vbabka@xxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 include/linux/huge_mm.h |    5 +++++
 mm/huge_memory.c        |    6 +++---
 mm/swap.c               |    5 +++++
 3 files changed, 13 insertions(+), 3 deletions(-)

diff -puN include/linux/huge_mm.h~thp-keep-huge-zero-page-pinned-until-tlb-flush include/linux/huge_mm.h
--- a/include/linux/huge_mm.h~thp-keep-huge-zero-page-pinned-until-tlb-flush
+++ a/include/linux/huge_mm.h
@@ -152,6 +152,7 @@ static inline bool is_huge_zero_pmd(pmd_
 }
 
 struct page *get_huge_zero_page(void);
+void put_huge_zero_page(void);
 
 #else /* CONFIG_TRANSPARENT_HUGEPAGE */
 #define HPAGE_PMD_SHIFT ({ BUILD_BUG(); 0; })
@@ -208,6 +209,10 @@ static inline bool is_huge_zero_page(str
 	return false;
 }
 
+static inline void put_huge_zero_page(void)
+{
+	BUILD_BUG();
+}
 
 static inline struct page *follow_devmap_pmd(struct vm_area_struct *vma,
 		unsigned long addr, pmd_t *pmd, int flags)
diff -puN mm/huge_memory.c~thp-keep-huge-zero-page-pinned-until-tlb-flush mm/huge_memory.c
--- a/mm/huge_memory.c~thp-keep-huge-zero-page-pinned-until-tlb-flush
+++ a/mm/huge_memory.c
@@ -232,7 +232,7 @@ retry:
 	return READ_ONCE(huge_zero_page);
 }
 
-static void put_huge_zero_page(void)
+void put_huge_zero_page(void)
 {
 	/*
 	 * Counter should never go to zero here. Only shrinker can put
@@ -1684,12 +1684,12 @@ int zap_huge_pmd(struct mmu_gather *tlb,
 	if (vma_is_dax(vma)) {
 		spin_unlock(ptl);
 		if (is_huge_zero_pmd(orig_pmd))
-			put_huge_zero_page();
+			tlb_remove_page(tlb, pmd_page(orig_pmd));
 	} else if (is_huge_zero_pmd(orig_pmd)) {
 		pte_free(tlb->mm, pgtable_trans_huge_withdraw(tlb->mm, pmd));
 		atomic_long_dec(&tlb->mm->nr_ptes);
 		spin_unlock(ptl);
-		put_huge_zero_page();
+		tlb_remove_page(tlb, pmd_page(orig_pmd));
 	} else {
 		struct page *page = pmd_page(orig_pmd);
 		page_remove_rmap(page, true);
diff -puN mm/swap.c~thp-keep-huge-zero-page-pinned-until-tlb-flush mm/swap.c
--- a/mm/swap.c~thp-keep-huge-zero-page-pinned-until-tlb-flush
+++ a/mm/swap.c
@@ -728,6 +728,11 @@ void release_pages(struct page **pages, 
 			zone = NULL;
 		}
 
+		if (is_huge_zero_page(page)) {
+			put_huge_zero_page();
+			continue;
+		}
+
 		page = compound_head(page);
 		if (!put_page_testzero(page))
 			continue;
_
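For reference, the ordering that the patch enforces can be modelled in
plain userspace C.  The sketch below is a toy model, not kernel code:
the refcount handling, shrinker_may_free() and the flush markers are
all invented for illustration.  It only demonstrates why dropping the
pin at zap time (the old order) opens a window in which the shrinker
can free the huge zero page while stale TLB entries still point at it,
and why routing the page through the mmu_gather batch (the new order)
keeps it pinned until release_pages() runs after the flush.

/*
 * Toy model of the ordering fixed above -- nothing here is kernel
 * code; all names are invented for illustration.
 */
#include <assert.h>
#include <stdio.h>

/* One ref held by the shrinker's cache, one by the mapping being zapped. */
static int huge_zero_refcount = 2;

static void put_huge_zero_page(void)
{
	huge_zero_refcount--;
}

/* The real shrinker frees the page once only its own reference is left. */
static int shrinker_may_free(void)
{
	return huge_zero_refcount == 1;
}

int main(void)
{
	/*
	 * Old order: zap_huge_pmd() dropped the reference immediately
	 * and the TLB flush happened later.  In between, the shrinker
	 * could free the page while stale TLB entries still pointed
	 * at it.
	 */
	put_huge_zero_page();			/* at zap time */
	assert(shrinker_may_free());		/* race window is open */
	/* ... the TLB flush only happens here, too late ... */

	huge_zero_refcount = 2;			/* reset the model */

	/*
	 * New order: zap_huge_pmd() calls tlb_remove_page(), so the
	 * reference travels with the mmu_gather batch and is dropped
	 * by release_pages() only after tlb_flush_mmu() has run.
	 */
	assert(!shrinker_may_free());		/* still pinned across the window */
	/* tlb_flush_mmu() -> free_pages_and_swap_cache() -> release_pages() */
	put_huge_zero_page();			/* safe: no stale TLB entries left */

	printf("ordering model OK\n");
	return 0;
}

The is_huge_zero_page() check added to release_pages() in the patch is
what makes the final put in the new order work: the huge zero page is a
special singleton, so release_pages() must drop the pin via
put_huge_zero_page() rather than through the normal put_page_testzero()
path.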
Patches currently in -mm which might be from kirill.shutemov@xxxxxxxxxxxxxxx are

thp-keep-huge-zero-page-pinned-until-tlb-flush.patch
mm-make-swapin-readahead-to-improve-thp-collapse-rate-fix.patch
mm-make-swapin-readahead-to-improve-thp-collapse-rate-fix-2.patch
mm-make-swapin-readahead-to-improve-thp-collapse-rate-fix-3.patch

--
To unsubscribe from this list: send the line "unsubscribe mm-commits" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html