Hi all,

During Peter's talk at LSFMM, it was agreed that one of the things that
need to be done in order to further integrate hugetlb into mm core is to
unify the generic and hugetlb pagewalkers.

I started with this one, which unifies hugetlb into the generic pagewalk
instead of having its own hugetlb_entry entries. This means that
pmd_entry/pte_entry (for cont-pte) entries will also deal with hugetlb
vmas, and so will the new pud_entry entries, since hugetlb can be pud
mapped (devm pages as well, but we seem not to care about those, with
the exception of the hmm code).

The outcome is this RFC.

Before you continue, let me clarify certain points:

This patchset is not yet finished, as 1) some things need more thought,
2) some are still broken (like the hmm bits, as I am clueless about
that), and 3) some paths have not been tested at all.

The things I tested were:

 - memory-failure
 - smaps/numa_maps/pagemap (the latter only for pud/pmd, not
   cont-{pmd,ptes})
 - mempolicy

on arm64 (for 64KB and 32MB hugetlb pages) and on x86_64 (for 2MB and
1GB hugetlb pages).

More tests need to be conducted, and I plan to borrow a ppc64le machine
to also carry out some tests there, but for now this is what my
bandwidth allowed me to do.

I am well aware that there are two things that might scare people, one
being the number of patches, and the other being the amount of code
added.

For the former, I will by no means ask anyone to review 45 patches, but
since this patchset touches isolated paths (damon, mincore, hmm,
task_mmu, memory-failure, mempolicy), I will point out some people who
might be able to help me out with those different bits:

 - Miaohe for memory-failure bits
 - David for task_mmu bits
 - SeongJae Park for damon bits
 - Jerome for hmm bits
 - feel free to join for the rest

I think that might be a good approach: instead of having to review 45
patches, one only has to review 5 or 6 at most.

For the latter, there is an explanation: hugetlb operates on ptes
(although it allocates puds/pmds and the operations work on that level
too), which means that now that we will handle PUD/PMD-mapped hugetlb
with {pud,pmd}_* operations, we need to introduce quite a few functions
that do not exist yet and that we will need from now on.

Although I am sending this out, this is not "RFC-ready material". As I
said, there are still things that need to be improved/fixed/tested, but
I wanted to make it public nevertheless so we can gather some
constructive feedback that helps us move in the right direction, and
also to widen the discussion.

So take this more as a "Hey, let me show what I am doing, and call me
out on things you consider wrong".

Thanks in advance

Oscar Salvador (45):
  arch/x86: Drop own definition of pgd,p4d_leaf
  mm: Add {pmd,pud}_huge_lock helper
  mm/pagewalk: Move vma_pgtable_walk_begin and vma_pgtable_walk_end upfront
  mm/pagewalk: Only call pud_entry when we have a pud leaf
  mm/pagewalk: Enable walk_pmd_range to handle cont-pmds
  mm/pagewalk: Do not try to split non-thp pud or pmd leafs
  arch/s390: Enable __s390_enable_skey_pmd to handle hugetlb vmas
  fs/proc: Enable smaps_pmd_entry to handle PMD-mapped hugetlb vmas
  mm: Implement pud-version functions for swap and vm_normal_page_pud
  fs/proc: Create smaps_pud_range to handle PUD-mapped hugetlb vmas
  fs/proc: Enable smaps_pte_entry to handle cont-pte mapped hugetlb vmas
  fs/proc: Enable pagemap_pmd_range to handle hugetlb vmas
  mm: Implement pud-version uffd functions
  fs/proc: Create pagemap_pud_range to handle PUD-mapped hugetlb vmas
  fs/proc: Adjust pte_to_pagemap_entry for hugetlb vmas
  fs/proc: Enable pagemap_scan_pmd_entry to handle hugetlb vmas
  mm: Implement pud-version for pud_mkinvalid and pudp_establish
  fs/proc: Create pagemap_scan_pud_entry to handle PUD-mapped hugetlb vmas
  fs/proc: Enable gather_pte_stats to handle hugetlb vmas
  fs/proc: Enable gather_pte_stats to handle cont-pte mapped hugetlb vmas
  fs/proc: Create gather_pud_stats to handle PUD-mapped hugetlb pages
  mm/mempolicy: Enable queue_folios_pmd to handle hugetlb vmas
  mm/mempolicy: Create queue_folios_pud to handle PUD-mapped hugetlb vmas
  mm/memory_failure: Enable check_hwpoisoned_pmd_entry to handle hugetlb vmas
  mm/memory-failure: Create check_hwpoisoned_pud_entry to handle PUD-mapped hugetlb vmas
  mm/damon: Enable damon_young_pmd_entry to handle hugetlb vmas
  mm/damon: Create damon_young_pud_entry to handle PUD-mapped hugetlb vmas
  mm/damon: Enable damon_mkold_pmd_entry to handle hugetlb vmas
  mm/damon: Create damon_mkold_pud_entry to handle PUD-mapped hugetlb vmas
  mm,mincore: Enable mincore_pte_range to handle hugetlb vmas
  mm/mincore: Create mincore_pud_range to handle PUD-mapped hugetlb vmas
  mm/hmm: Enable hmm_vma_walk_pmd, to handle hugetlb vmas
  mm/hmm: Enable hmm_vma_walk_pud to handle PUD-mapped hugetlb vmas
  arch/powerpc: Skip hugetlb vmas in subpage_mark_vma_nohuge
  arch/s390: Skip hugetlb vmas in thp_split_mm
  fs/proc: Make clear_refs_test_walk skip hugetlb vmas
  mm/lock: Make mlock_test_walk skip hugetlb vmas
  mm/madvise: Make swapin_test_walk skip hugetlb vmas
  mm/madvise: Make madvise_cold_test_walk skip hugetlb vmas
  mm/madvise: Make madvise_free_test_walk skip hugetlb vmas
  mm/migrate_device: Make migrate_vma_test_walk skip hugetlb vmas
  mm/memcontrol: Make mem_cgroup_move_test_walk skip hugetlb vmas
  mm/memcontrol: Make mem_cgroup_count_test_walk skip hugetlb vmas
  mm/hugetlb_vmemmap: Make vmemmap_test_walk skip hugetlb vmas
  mm: Delete all hugetlb_entry entries

 arch/arm64/include/asm/pgtable.h             |  19 +
 arch/loongarch/include/asm/pgtable.h         |   8 +
 arch/mips/include/asm/pgtable.h              |   7 +
 arch/powerpc/include/asm/book3s/64/pgtable.h |   8 +-
 arch/powerpc/mm/book3s64/pgtable.c           |  15 +-
 arch/powerpc/mm/book3s64/subpage_prot.c      |   2 +
 arch/riscv/include/asm/pgtable.h             |  15 +
 arch/s390/mm/gmap.c                          |  37 +-
 arch/x86/include/asm/pgtable.h               | 199 +++++----
 fs/proc/task_mmu.c                           | 434 ++++++++++++-------
 include/asm-generic/pgtable_uffd.h           |  30 ++
 include/linux/mm.h                           |   4 +
 include/linux/mm_inline.h                    |  34 ++
 include/linux/pagewalk.h                     |  10 -
 include/linux/pgtable.h                      |  77 +++-
 include/linux/swapops.h                      |  27 ++
 mm/damon/ops-common.c                        |  21 +-
 mm/damon/vaddr.c                             | 173 ++++----
 mm/hmm.c                                     |  69 +--
 mm/hugetlb_vmemmap.c                         |  12 +
 mm/madvise.c                                 |  36 ++
 mm/memcontrol-v1.c                           |  24 +
 mm/memory-failure.c                          |  99 +++--
 mm/memory.c                                  |  51 +++
 mm/mempolicy.c                               | 121 +++---
 mm/migrate_device.c                          |  12 +
 mm/mincore.c                                 |  46 +-
 mm/mlock.c                                   |  12 +
 mm/mprotect.c                                |  10 -
 mm/pagewalk.c                                |  73 +---
 mm/pgtable-generic.c                         |  21 +
 31 files changed, 1089 insertions(+), 617 deletions(-)

-- 
2.26.2