On 21/03/2025 at 14:06, Alexandre Ghiti wrote:
This patchset intends to merge the contiguous ptes hugetlbfs implementation
of arm64 and riscv.
Can we also add powerpc to the dance?
powerpc also uses contiguous PTEs, although there is not (yet) a special
name for it:
- b250c8c08c79 ("powerpc/8xx: Manage 512k huge pages as standard pages")
- e47168f3d1b1 ("powerpc/8xx: Support 16k hugepages with 4k pages")
powerpc also uses contiguous PMDs/PUDs for larger hugepages:
- 57fb15c32f4f ("powerpc/64s: use contiguous PMD/PUD instead of HUGEPD")
- 7c44202e3609 ("powerpc/e500: use contiguous PMD instead of hugepd")
- 0549e7666373 ("powerpc/8xx: rework support for 8M pages using
contiguous PTE entries")
Christophe
Both arm64 and riscv support the use of contiguous ptes to map pages that
are larger than the base page size, respectively called contpte
and svnapot.
The riscv implementation differs from arm64's in that the LSBs of the
pfn of a svnapot pte are used to store the size of the mapping, allowing
for future sizes to be added (for now only 64KB is supported). That's an
issue for the core mm code, which expects to find the *real* pfn a pte points
to. Patch 1 fixes that by always returning svnapot ptes with the real pfn
and restoring the size of the mapping when the pte is written to a page table.
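
To make the pfn issue above concrete, here is a minimal userspace sketch of how
a size can be carried in the low pfn bits and stripped again before a pte is
handed to generic code. It is not the code from the patch: the helper names are
hypothetical, and the layout (pfn field at bit 10, N bit at bit 63, a 64KB
mapping flagged by pfn[3:0] == 0b1000) is my reading of the Svnapot extension,
so treat it as an assumption. __builtin_ctzll() is a GCC/Clang builtin.

```c
#include <stdint.h>

/* Assumed pte layout (RISC-V Sv39/Sv48 with Svnapot), not taken from the patch. */
#define PTE_PFN_SHIFT   10
#define PTE_PFN_BITS    44
#define PTE_NAPOT       (1ULL << 63)	/* N bit: pte is part of a NAPOT range */
#define NAPOT_ORDER_64K 4		/* 16 base pages of 4KB */

/* Hypothetical helper: extract the raw pfn field, size bits included. */
static uint64_t pte_pfn_field(uint64_t pte)
{
	return (pte >> PTE_PFN_SHIFT) & ((1ULL << PTE_PFN_BITS) - 1);
}

/*
 * Hypothetical helper: build a NAPOT pte. The low 'order' bits of the pfn
 * are overwritten with a single set bit at position order - 1, which is
 * what encodes the mapping size.
 */
static uint64_t napot_pte_encode(uint64_t real_pfn, unsigned int order,
				 uint64_t prot)
{
	uint64_t low_mask = (1ULL << order) - 1;
	uint64_t pfn = (real_pfn & ~low_mask) | (1ULL << (order - 1));

	return (pfn << PTE_PFN_SHIFT) | prot | PTE_NAPOT;
}

/* Return the NAPOT order of a pte, or 0 if it is not a NAPOT pte. */
static unsigned int napot_pte_order(uint64_t pte)
{
	uint64_t pfn = pte_pfn_field(pte);

	if (!(pte & PTE_NAPOT) || pfn == 0)
		return 0;

	/* The lowest set bit of the pfn field marks the size. */
	return (unsigned int)__builtin_ctzll(pfn) + 1;
}

/* Return the pfn with the size encoding removed, as generic code expects. */
static uint64_t napot_pte_real_pfn(uint64_t pte)
{
	unsigned int order = napot_pte_order(pte);
	uint64_t pfn = pte_pfn_field(pte);

	if (!order)
		return pfn;

	return pfn & ~((1ULL << order) - 1);
}
```

With these helpers, encoding pfn 0x12340 at NAPOT_ORDER_64K stores 0x12348 in
the pfn field, and napot_pte_real_pfn() gives back 0x12340.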
The following patches merely merge the two very similar implementations
that currently exist in arm64 and riscv. This paves the way for reusing
the recent contpte THP work by Ryan [1], to avoid
reimplementing the same in riscv.
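
As a rough illustration of the kind of split such a merge implies, here is a
toy userspace model: a generic "get" helper walks the contiguous entries and
folds their accessed/dirty bits into the returned pte, while the architecture
only supplies the number of contiguous entries. Everything here (the toy_*
names, the bit positions, the hook signature) is hypothetical and is not meant
to reproduce mm/hugetlb_contpte.c or arch_contpte_get_num_contig().

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t toy_pte_t;

/* Hypothetical bit positions, chosen only for this sketch. */
#define TOY_PTE_ACCESSED (1ULL << 6)
#define TOY_PTE_DIRTY    (1ULL << 7)

/* Arch hook: how many contiguous ptes back a huge page of 'size' bytes. */
typedef size_t (*toy_num_contig_fn)(size_t size);

/* Example hook for a 4KB base page: 64KB is mapped by 16 contiguous ptes. */
static size_t toy_num_contig_4k(size_t size)
{
	return size == 64 * 1024 ? 16 : 1;
}

/*
 * Generic part: return the first pte of the range, with the accessed and
 * dirty bits of all the other contiguous entries folded in, so the caller
 * sees one logical huge pte.
 */
static toy_pte_t toy_huge_ptep_get(const toy_pte_t *ptep, size_t size,
				   toy_num_contig_fn num_contig)
{
	size_t i, n = num_contig(size);
	toy_pte_t pte = ptep[0];

	for (i = 1; i < n; i++)
		pte |= ptep[i] & (TOY_PTE_ACCESSED | TOY_PTE_DIRTY);

	return pte;
}
```

The point of the shape is that only toy_num_contig_4k() would differ per
architecture; the loop itself has nothing arch-specific in it.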
This patchset was tested by running the libhugetlbfs testsuite with 64KB
and 2MB pages on both architectures (on a 4KB base page size arm64 kernel).
[1] https://lore.kernel.org/linux-arm-kernel/20240215103205.2607016-1-ryan.roberts@xxxxxxx/
v4: https://lore.kernel.org/linux-riscv/20250127093530.19548-1-alexghiti@xxxxxxxxxxxx/
v3: https://lore.kernel.org/all/20240802151430.99114-1-alexghiti@xxxxxxxxxxxx/
v2: https://lore.kernel.org/linux-riscv/20240508113419.18620-1-alexghiti@xxxxxxxxxxxx/
v1: https://lore.kernel.org/linux-riscv/20240301091455.246686-1-alexghiti@xxxxxxxxxxxx/
Changes in v5:
- Fix "int i" unused variable in patch 2 (as reported by PW)
- Fix !svnapot build
- Fix arch_make_huge_pte() which returned a real napot pte
- Make __ptep_get(), ptep_get_and_clear() and __set_ptes() napot aware to
avoid leaking real napot pfns to core mm
- Fix arch_contpte_get_num_contig() that used to always try to get the
mapping size from the ptep, which does not work if the ptep comes from the core mm
- Rebase on top of 6.14-rc7 + fix for
huge_ptep_get_and_clear()/huge_pte_clear()
https://lore.kernel.org/linux-riscv/20250317072551.572169-1-alexghiti@xxxxxxxxxxxx/
Changes in v4:
- Rebase on top of 6.13
Changes in v3:
- Split set_ptes and ptep_get into internal and external API (Ryan)
- Rename ARCH_HAS_CONTPTE into ARCH_WANT_GENERAL_HUGETLB_CONTPTE so that
we split hugetlb functions from contpte functions (actually riscv contpte
functions to support THP will come into another series) (Ryan)
- Rebase on top of 6.11-rc1
Changes in v2:
- Rebase on top of 6.9-rc3
Alexandre Ghiti (9):
riscv: Safely remove huge_pte_offset() when manipulating NAPOT ptes
riscv: Restore the pfn in a NAPOT pte when manipulated by core mm code
mm: Use common huge_ptep_get() function for riscv/arm64
mm: Use common set_huge_pte_at() function for riscv/arm64
mm: Use common huge_pte_clear() function for riscv/arm64
mm: Use common huge_ptep_get_and_clear() function for riscv/arm64
mm: Use common huge_ptep_set_access_flags() function for riscv/arm64
mm: Use common huge_ptep_set_wrprotect() function for riscv/arm64
mm: Use common huge_ptep_clear_flush() function for riscv/arm64
arch/arm64/Kconfig | 1 +
arch/arm64/include/asm/hugetlb.h | 22 +--
arch/arm64/include/asm/pgtable.h | 68 ++++++-
arch/arm64/mm/hugetlbpage.c | 294 +---------------------------
arch/riscv/Kconfig | 1 +
arch/riscv/include/asm/hugetlb.h | 36 +---
arch/riscv/include/asm/pgtable-64.h | 11 ++
arch/riscv/include/asm/pgtable.h | 222 ++++++++++++++++++---
arch/riscv/mm/hugetlbpage.c | 243 +----------------------
arch/riscv/mm/pgtable.c | 6 +-
include/linux/hugetlb_contpte.h | 39 ++++
mm/Kconfig | 3 +
mm/Makefile | 1 +
mm/hugetlb_contpte.c | 258 ++++++++++++++++++++++++
14 files changed, 583 insertions(+), 622 deletions(-)
create mode 100644 include/linux/hugetlb_contpte.h
create mode 100644 mm/hugetlb_contpte.c