The patch titled Subject: arm64/mm: set_ptes()/set_pte_at(): new layer to manage contig bit has been added to the -mm mm-unstable branch. Its filename is arm64-mm-set_ptes-set_pte_at-new-layer-to-manage-contig-bit.patch This patch will shortly appear at https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/arm64-mm-set_ptes-set_pte_at-new-layer-to-manage-contig-bit.patch This patch will later appear in the mm-unstable branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Before you just go and hit "reply", please: a) Consider who else should be cc'ed b) Prefer to cc a suitable mailing list as well c) Ideally: find the original patch on the mailing list and do a reply-to-all to that, adding suitable additional cc's *** Remember to use Documentation/process/submit-checklist.rst when testing your code *** The -mm tree is included into linux-next via the mm-everything branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm and is updated there every 2-3 working days ------------------------------------------------------ From: Ryan Roberts <ryan.roberts@xxxxxxx> Subject: arm64/mm: set_ptes()/set_pte_at(): new layer to manage contig bit Date: Mon, 18 Dec 2023 10:50:49 +0000 Create a new layer for the in-table PTE manipulation APIs. For now, The existing API is prefixed with double underscore to become the arch-private API and the public API is just a simple wrapper that calls the private API. The public API implementation will subsequently be used to transparently manipulate the contiguous bit where appropriate. But since there are already some contig-aware users (e.g. hugetlb, kernel mapper), we must first ensure those users use the private API directly so that the future contig-bit manipulations in the public API do not interfere with those existing uses. set_pte_at() is a core macro that forwards to set_ptes() (with nr=1). Instead of creating a __set_pte_at() internal macro, convert all arch users to use set_ptes()/__set_ptes() directly, as appropriate. Callers in hugetlb may benefit from calling __set_ptes() once for their whole range rather than managing their own loop. This is left for future improvement. Link: https://lkml.kernel.org/r/20231218105100.172635-6-ryan.roberts@xxxxxxx Signed-off-by: Ryan Roberts <ryan.roberts@xxxxxxx> Tested-by: John Hubbard <jhubbard@xxxxxxxxxx> Cc: Alexander Potapenko <glider@xxxxxxxxxx> Cc: Alistair Popple <apopple@xxxxxxxxxx> Cc: Andrey Konovalov <andreyknvl@xxxxxxxxx> Cc: Andrey Ryabinin <ryabinin.a.a@xxxxxxxxx> Cc: Anshuman Khandual <anshuman.khandual@xxxxxxx> Cc: Ard Biesheuvel <ardb@xxxxxxxxxx> Cc: Barry Song <21cnbao@xxxxxxxxx> Cc: Catalin Marinas <catalin.marinas@xxxxxxx> Cc: David Hildenbrand <david@xxxxxxxxxx> Cc: Dmitry Vyukov <dvyukov@xxxxxxxxxx> Cc: James Morse <james.morse@xxxxxxx> Cc: Kefeng Wang <wangkefeng.wang@xxxxxxxxxx> Cc: Marc Zyngier <maz@xxxxxxxxxx> Cc: Mark Rutland <mark.rutland@xxxxxxx> Cc: Matthew Wilcox <willy@xxxxxxxxxxxxx> Cc: Oliver Upton <oliver.upton@xxxxxxxxx> Cc: Suzuki Poulouse <suzuki.poulose@xxxxxxx> Cc: Vincenzo Frascino <vincenzo.frascino@xxxxxxx> Cc: Will Deacon <will@xxxxxxxxxx> Cc: Yang Shi <shy828301@xxxxxxxxx> Cc: Yu Zhao <yuzhao@xxxxxxxxxx> Cc: Zenghui Yu <yuzenghui@xxxxxxxxxx> Cc: Zi Yan <ziy@xxxxxxxxxx> Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> --- arch/arm64/include/asm/pgtable.h | 10 +++++----- arch/arm64/kernel/mte.c | 2 +- arch/arm64/kvm/guest.c | 2 +- arch/arm64/mm/fault.c | 2 +- arch/arm64/mm/hugetlbpage.c | 10 +++++----- 5 files changed, 13 insertions(+), 13 deletions(-) --- a/arch/arm64/include/asm/pgtable.h~arm64-mm-set_ptes-set_pte_at-new-layer-to-manage-contig-bit +++ a/arch/arm64/include/asm/pgtable.h @@ -342,9 +342,9 @@ static inline void __sync_cache_and_tags mte_sync_tags(pte, nr_pages); } -static inline void set_ptes(struct mm_struct *mm, - unsigned long __always_unused addr, - pte_t *ptep, pte_t pte, unsigned int nr) +static inline void __set_ptes(struct mm_struct *mm, + unsigned long __always_unused addr, + pte_t *ptep, pte_t pte, unsigned int nr) { page_table_check_ptes_set(mm, ptep, pte, nr); __sync_cache_and_tags(pte, nr); @@ -358,7 +358,6 @@ static inline void set_ptes(struct mm_st pte_val(pte) += PAGE_SIZE; } } -#define set_ptes set_ptes /* * Huge pte definitions. @@ -1067,7 +1066,7 @@ static inline void arch_swap_restore(swp #endif /* CONFIG_ARM64_MTE */ /* - * On AArch64, the cache coherency is handled via the set_pte_at() function. + * On AArch64, the cache coherency is handled via the __set_ptes() function. */ static inline void update_mmu_cache_range(struct vm_fault *vmf, struct vm_area_struct *vma, unsigned long addr, pte_t *ptep, @@ -1121,6 +1120,7 @@ extern void ptep_modify_prot_commit(stru pte_t old_pte, pte_t new_pte); #define set_pte __set_pte +#define set_ptes __set_ptes #endif /* !__ASSEMBLY__ */ --- a/arch/arm64/kernel/mte.c~arm64-mm-set_ptes-set_pte_at-new-layer-to-manage-contig-bit +++ a/arch/arm64/kernel/mte.c @@ -67,7 +67,7 @@ int memcmp_pages(struct page *page1, str /* * If the page content is identical but at least one of the pages is * tagged, return non-zero to avoid KSM merging. If only one of the - * pages is tagged, set_pte_at() may zero or change the tags of the + * pages is tagged, __set_ptes() may zero or change the tags of the * other page via mte_sync_tags(). */ if (page_mte_tagged(page1) || page_mte_tagged(page2)) --- a/arch/arm64/kvm/guest.c~arm64-mm-set_ptes-set_pte_at-new-layer-to-manage-contig-bit +++ a/arch/arm64/kvm/guest.c @@ -1072,7 +1072,7 @@ int kvm_vm_ioctl_mte_copy_tags(struct kv } else { /* * Only locking to serialise with a concurrent - * set_pte_at() in the VMM but still overriding the + * __set_ptes() in the VMM but still overriding the * tags, hence ignoring the return value. */ try_page_mte_tagging(page); --- a/arch/arm64/mm/fault.c~arm64-mm-set_ptes-set_pte_at-new-layer-to-manage-contig-bit +++ a/arch/arm64/mm/fault.c @@ -205,7 +205,7 @@ static void show_pte(unsigned long addr) * * It needs to cope with hardware update of the accessed/dirty state by other * agents in the system and can safely skip the __sync_icache_dcache() call as, - * like set_pte_at(), the PTE is never changed from no-exec to exec here. + * like __set_ptes(), the PTE is never changed from no-exec to exec here. * * Returns whether or not the PTE actually changed. */ --- a/arch/arm64/mm/hugetlbpage.c~arm64-mm-set_ptes-set_pte_at-new-layer-to-manage-contig-bit +++ a/arch/arm64/mm/hugetlbpage.c @@ -254,12 +254,12 @@ void set_huge_pte_at(struct mm_struct *m if (!pte_present(pte)) { for (i = 0; i < ncontig; i++, ptep++, addr += pgsize) - set_pte_at(mm, addr, ptep, pte); + __set_ptes(mm, addr, ptep, pte, 1); return; } if (!pte_cont(pte)) { - set_pte_at(mm, addr, ptep, pte); + __set_ptes(mm, addr, ptep, pte, 1); return; } @@ -270,7 +270,7 @@ void set_huge_pte_at(struct mm_struct *m clear_flush(mm, addr, ptep, pgsize, ncontig); for (i = 0; i < ncontig; i++, ptep++, addr += pgsize, pfn += dpfn) - set_pte_at(mm, addr, ptep, pfn_pte(pfn, hugeprot)); + __set_ptes(mm, addr, ptep, pfn_pte(pfn, hugeprot), 1); } pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma, @@ -478,7 +478,7 @@ int huge_ptep_set_access_flags(struct vm hugeprot = pte_pgprot(pte); for (i = 0; i < ncontig; i++, ptep++, addr += pgsize, pfn += dpfn) - set_pte_at(mm, addr, ptep, pfn_pte(pfn, hugeprot)); + __set_ptes(mm, addr, ptep, pfn_pte(pfn, hugeprot), 1); return 1; } @@ -507,7 +507,7 @@ void huge_ptep_set_wrprotect(struct mm_s pfn = pte_pfn(pte); for (i = 0; i < ncontig; i++, ptep++, addr += pgsize, pfn += dpfn) - set_pte_at(mm, addr, ptep, pfn_pte(pfn, hugeprot)); + __set_ptes(mm, addr, ptep, pfn_pte(pfn, hugeprot), 1); } pte_t huge_ptep_clear_flush(struct vm_area_struct *vma, _ Patches currently in -mm which might be from ryan.roberts@xxxxxxx are mm-allow-deferred-splitting-of-arbitrary-anon-large-folios.patch mm-non-pmd-mappable-large-folios-for-folio_add_new_anon_rmap.patch mm-thp-introduce-multi-size-thp-sysfs-interface.patch mm-thp-introduce-multi-size-thp-sysfs-interface-fix.patch mm-thp-support-allocation-of-anonymous-multi-size-thp.patch mm-thp-support-allocation-of-anonymous-multi-size-thp-fix.patch selftests-mm-kugepaged-restore-thp-settings-at-exit.patch selftests-mm-factor-out-thp-settings-management.patch selftests-mm-support-multi-size-thp-interface-in-thp_settings.patch selftests-mm-khugepaged-enlighten-for-multi-size-thp.patch selftests-mm-cow-generalize-do_run_with_thp-helper.patch selftests-mm-cow-add-tests-for-anonymous-multi-size-thp.patch mm-thp-batch-collapse-pmd-with-set_ptes.patch mm-batch-copy-pte-ranges-during-fork.patch mm-batch-clear-pte-ranges-during-zap_pte_range.patch arm64-mm-set_pte-new-layer-to-manage-contig-bit.patch arm64-mm-set_ptes-set_pte_at-new-layer-to-manage-contig-bit.patch arm64-mm-pte_clear-new-layer-to-manage-contig-bit.patch arm64-mm-ptep_get_and_clear-new-layer-to-manage-contig-bit.patch arm64-mm-ptep_test_and_clear_young-new-layer-to-manage-contig-bit.patch arm64-mm-ptep_clear_flush_young-new-layer-to-manage-contig-bit.patch arm64-mm-ptep_set_wrprotect-new-layer-to-manage-contig-bit.patch arm64-mm-ptep_set_access_flags-new-layer-to-manage-contig-bit.patch arm64-mm-ptep_get-new-layer-to-manage-contig-bit.patch arm64-mm-split-__flush_tlb_range-to-elide-trailing-dsb.patch arm64-mm-wire-up-pte_cont-for-user-mappings.patch arm64-mm-implement-new-helpers-to-optimize-fork.patch arm64-mm-implement-clear_ptes-to-optimize-exit-munmap-dontneed.patch selftests-mm-log-run_vmtestssh-results-in-tap-format.patch