Since commit 70cbc3cc78a99 ("mm: gup: fix the fast GUP race against THP
collapse"), the lockless_pages_from_mm() fastpath rechecks the pmd_t to
ensure that the page table was not removed by khugepaged in between.

However, lockless_pages_from_mm() still requires that the page table is
not concurrently freed.  Fix it by sending IPIs (if the architecture
uses semi-RCU-style page table freeing) before freeing/reusing page
tables.

Link: https://lkml.kernel.org/r/20221129154730.2274278-2-jannh@xxxxxxxxxx
Link: https://lkml.kernel.org/r/20221128180252.1684965-2-jannh@xxxxxxxxxx
Link: https://lkml.kernel.org/r/20221125213714.4115729-2-jannh@xxxxxxxxxx
Fixes: ba76149f47d8 ("thp: khugepaged")
Signed-off-by: Jann Horn <jannh@xxxxxxxxxx>
Reviewed-by: Yang Shi <shy828301@xxxxxxxxx>
Acked-by: David Hildenbrand <david@xxxxxxxxxx>
Cc: John Hubbard <jhubbard@xxxxxxxxxx>
Cc: Peter Xu <peterx@xxxxxxxxxx>
Cc: <stable@xxxxxxxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
[manual backport: two of the three places in khugepaged that can free
 ptes were refactored into a common helper between 5.15 and 6.0;
 TLB flushing was refactored between 5.4 and 5.10;
 TLB flushing was refactored between 4.19 and 5.4;
 pmd collapse for PTE-mapped THP was only added in 5.4;
 ugly hack needed in <=4.19 for s390]
Signed-off-by: Jann Horn <jannh@xxxxxxxxxx>
---

added an incredibly ugly hack to work around the s390 issue reported in
https://lore.kernel.org/all/Y5aaQqNH71IMVks0@sashalap/

 include/asm-generic/tlb.h |  6 ++++++
 mm/khugepaged.c           | 15 +++++++++++++++
 mm/memory.c               |  5 +++++
 3 files changed, 26 insertions(+)

diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h
index db72ad39853b..737f5cb0dc84 100644
--- a/include/asm-generic/tlb.h
+++ b/include/asm-generic/tlb.h
@@ -61,6 +61,12 @@ struct mmu_table_batch {
 extern void tlb_table_flush(struct mmu_gather *tlb);
 extern void tlb_remove_table(struct mmu_gather *tlb, void *table);
 
+void tlb_remove_table_sync_one(void);
+
+#else
+
+static inline void tlb_remove_table_sync_one(void) { }
+
 #endif
 
 /*
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 5dd14ef2e1de..0f217bb9b534 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -23,6 +23,19 @@
 #include <asm/pgalloc.h>
 #include "internal.h"
 
+/* gross hack for <=4.19 stable */
+#ifdef CONFIG_S390
+static void tlb_remove_table_smp_sync(void *arg)
+{
+	/* Simply deliver the interrupt */
+}
+
+static void tlb_remove_table_sync_one(void)
+{
+	smp_call_function(tlb_remove_table_smp_sync, NULL, 1);
+}
+#endif
+
 enum scan_result {
 	SCAN_FAIL,
 	SCAN_SUCCEED,
@@ -1045,6 +1058,7 @@ static void collapse_huge_page(struct mm_struct *mm,
 	_pmd = pmdp_collapse_flush(vma, address, pmd);
 	spin_unlock(pmd_ptl);
 	mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end);
+	tlb_remove_table_sync_one();
 
 	spin_lock(pte_ptl);
 	isolated = __collapse_huge_page_isolate(vma, address, pte);
@@ -1294,6 +1308,7 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
 				_pmd = pmdp_collapse_flush(vma, addr, pmd);
 				spin_unlock(ptl);
 				mm_dec_nr_ptes(mm);
+				tlb_remove_table_sync_one();
 				pte_free(mm, pmd_pgtable(_pmd));
 			}
 			up_write(&mm->mmap_sem);
diff --git a/mm/memory.c b/mm/memory.c
index 800834cff4e6..b80ce6b3c8f4 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -362,6 +362,11 @@ static void tlb_remove_table_smp_sync(void *arg)
 	/* Simply deliver the interrupt */
 }
 
+void tlb_remove_table_sync_one(void)
+{
+	smp_call_function(tlb_remove_table_smp_sync, NULL, 1);
+}
+
 static void tlb_remove_table_one(void *table)
 {
 	/*
base-commit: c1ccef20f08e192228a2056808113b453d18c094
-- 
2.39.0.rc1.256.g54fd8350bd-goog
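
For reviewers who want the synchronization spelled out, here is a minimal
sketch (kernel-style C, not taken from the tree) of why a dummy IPI
broadcast is sufficient: it assumes GUP-fast runs with interrupts disabled,
and the names dummy_ipi()/wait_for_gup_fast_walkers() are illustrative
placeholders for what tlb_remove_table_sync_one() does in this patch.

#include <linux/smp.h>

static void dummy_ipi(void *arg)
{
	/* Simply deliver the interrupt; no work to do. */
}

/* Illustrative stand-in for tlb_remove_table_sync_one(). */
static void wait_for_gup_fast_walkers(void)
{
	/*
	 * GUP-fast disables IRQs around its lockless page table walk, so a
	 * synchronous IPI broadcast (wait=1) cannot return while any CPU is
	 * still inside that walk dereferencing the detached page table.
	 */
	smp_call_function(dummy_ipi, NULL, 1);
}

/*
 * Simplified collapse-side ordering (the real callers are
 * collapse_huge_page() and retract_page_tables()):
 *
 *	_pmd = pmdp_collapse_flush(vma, addr, pmd);	// detach page table
 *	wait_for_gup_fast_walkers();			// drain lockless walkers
 *	pte_free(mm, pmd_pgtable(_pmd));		// now safe to free/reuse
 */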