The patch titled
     Subject: mm: hugetlb: considering PMD sharing when flushing cache/TLBs
has been added to the -mm tree.  Its filename is
     mm-hugetlb-considering-pmd-sharing-when-flushing-cache-tlbs.patch

This patch should soon appear at
    https://ozlabs.org/~akpm/mmots/broken-out/mm-hugetlb-considering-pmd-sharing-when-flushing-cache-tlbs.patch
and later at
    https://ozlabs.org/~akpm/mmotm/broken-out/mm-hugetlb-considering-pmd-sharing-when-flushing-cache-tlbs.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Baolin Wang <baolin.wang@xxxxxxxxxxxxxxxxx>
Subject: mm: hugetlb: considering PMD sharing when flushing cache/TLBs

This patchset fixes some cache flushing issues when PMD sharing is
possible for hugetlb pages; these were found by code inspection.
Meanwhile, Mike found that flush_cache_page() cannot cover the whole size
of a hugetlb page on some architectures [1], so I added a new patch 3 to
fix this issue, since after some investigation I found that only
try_to_unmap_one() and try_to_migrate_one() need to be fixed.

[1] https://lore.kernel.org/linux-mm/064da3bb-5b4b-7332-a722-c5a541128705@xxxxxxxxxx/

This patch (of 3):

When moving hugetlb page tables, the cache flushing is done in
move_page_tables() without considering the shared PMDs, which may cause
cache issues on some architectures.  Thus we should move the hugetlb
cache flushing into move_hugetlb_page_tables(), covering the shared PMD
ranges calculated by adjust_range_if_pmd_sharing_possible() (sketched
below).  Also expand the TLB flushing range in case of shared PMDs.

Note this was found by code inspection and has not caused a real problem
in practice so far.
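For reference, here is a minimal sketch of how the range adjustment
behaves (simplified from mm/hugetlb.c; range_sketch() is a hypothetical
name used only for illustration, not the exact kernel code):

	/*
	 * If the mapping is sharable and the vma spans at least one
	 * PUD-aligned region, round the flush range out to PUD_SIZE
	 * boundaries so it also covers PMD page tables that may be
	 * shared with other processes mapping the same file range.
	 */
	static void range_sketch(struct vm_area_struct *vma,
				 unsigned long *start, unsigned long *end)
	{
		unsigned long v_start = ALIGN(vma->vm_start, PUD_SIZE);
		unsigned long v_end = ALIGN_DOWN(vma->vm_end, PUD_SIZE);

		/* PMD sharing needs a sharable, PUD-sized aligned region */
		if (!(vma->vm_flags & VM_MAYSHARE) || v_start >= v_end)
			return;

		if (*start > v_start)
			*start = ALIGN_DOWN(*start, PUD_SIZE);
		if (*end < v_end)
			*end = ALIGN(*end, PUD_SIZE);
	}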
Link: https://lkml.kernel.org/r/cover.1651056365.git.baolin.wang@xxxxxxxxxxxxxxxxx
Link: https://lkml.kernel.org/r/0443c8cf20db554d3ff4b439b30e0ff26c0181dd.1651056365.git.baolin.wang@xxxxxxxxxxxxxxxxx
Fixes: 550a7d60bd5e ("mm, hugepages: add mremap() support for hugepage backed vma")
Signed-off-by: Baolin Wang <baolin.wang@xxxxxxxxxxxxxxxxx>
Reviewed-by: Mike Kravetz <mike.kravetz@xxxxxxxxxx>
Cc: Mina Almasry <almasrymina@xxxxxxxxxx>
Cc: Muchun Song <songmuchun@xxxxxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 mm/hugetlb.c |   17 +++++++++++++++--
 mm/mremap.c  |    2 +-
 2 files changed, 16 insertions(+), 3 deletions(-)

--- a/mm/hugetlb.c~mm-hugetlb-considering-pmd-sharing-when-flushing-cache-tlbs
+++ a/mm/hugetlb.c
@@ -4935,10 +4935,17 @@ int move_hugetlb_page_tables(struct vm_a
 	unsigned long old_addr_copy;
 	pte_t *src_pte, *dst_pte;
 	struct mmu_notifier_range range;
+	bool shared_pmd = false;
 
 	mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, vma, mm,
 				old_addr, old_end);
 	adjust_range_if_pmd_sharing_possible(vma, &range.start, &range.end);
+	/*
+	 * In case of shared PMDs, we should cover the maximum possible
+	 * range.
+	 */
+	flush_cache_range(vma, range.start, range.end);
+
 	mmu_notifier_invalidate_range_start(&range);
 	/* Prevent race with file truncation */
 	i_mmap_lock_write(mapping);
@@ -4955,8 +4962,10 @@ int move_hugetlb_page_tables(struct vm_a
 		 */
 		old_addr_copy = old_addr;
 
-		if (huge_pmd_unshare(mm, vma, &old_addr_copy, src_pte))
+		if (huge_pmd_unshare(mm, vma, &old_addr_copy, src_pte)) {
+			shared_pmd = true;
 			continue;
+		}
 
 		dst_pte = huge_pte_alloc(mm, new_vma, new_addr, sz);
 		if (!dst_pte)
@@ -4964,7 +4973,11 @@ int move_hugetlb_page_tables(struct vm_a
 
 		move_huge_pte(vma, old_addr, new_addr, src_pte, dst_pte);
 	}
-	flush_tlb_range(vma, old_end - len, old_end);
+
+	if (shared_pmd)
+		flush_tlb_range(vma, range.start, range.end);
+	else
+		flush_tlb_range(vma, old_end - len, old_end);
 	mmu_notifier_invalidate_range_end(&range);
 	i_mmap_unlock_write(mapping);
 
--- a/mm/mremap.c~mm-hugetlb-considering-pmd-sharing-when-flushing-cache-tlbs
+++ a/mm/mremap.c
@@ -490,12 +490,12 @@ unsigned long move_page_tables(struct vm
 		return 0;
 
 	old_end = old_addr + len;
-	flush_cache_range(vma, old_addr, old_end);
 
 	if (is_vm_hugetlb_page(vma))
 		return move_hugetlb_page_tables(vma, new_vma, old_addr,
 						new_addr, len);
 
+	flush_cache_range(vma, old_addr, old_end);
 	mmu_notifier_range_init(&range, MMU_NOTIFY_UNMAP, 0, vma, vma->vm_mm,
 				old_addr, old_end);
 	mmu_notifier_invalidate_range_start(&range);
_

Patches currently in -mm which might be from baolin.wang@xxxxxxxxxxxxxxxxx are

mm-migrate-simplify-the-refcount-validation-when-migrating-hugetlb-mapping.patch
mm-hugetlb-add-missing-cache-flushing-in-hugetlb_unshare_all_pmds.patch
mm-hugetlb-considering-pmd-sharing-when-flushing-cache-tlbs.patch
mm-rmap-move-the-cache-flushing-to-the-correct-place-for-hugetlb-pmd-sharing.patch
mm-rmap-use-flush_cache_range-to-flush-cache-for-hugetlb-pages.patch