The patch titled Subject: mm/debug_vm_pgtable: drop RANDOM_ORVALUE trick has been added to the -mm mm-unstable branch. Its filename is mm-debug_vm_pgtable-drop-random_orvalue-trick.patch This patch will shortly appear at https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/mm-debug_vm_pgtable-drop-random_orvalue-trick.patch This patch will later appear in the mm-unstable branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Before you just go and hit "reply", please: a) Consider who else should be cc'ed b) Prefer to cc a suitable mailing list as well c) Ideally: find the original patch on the mailing list and do a reply-to-all to that, adding suitable additional cc's *** Remember to use Documentation/process/submit-checklist.rst when testing your code *** The -mm tree is included into linux-next via the mm-everything branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm and is updated there every 2-3 working days ------------------------------------------------------ From: Peter Xu <peterx@xxxxxxxxxx> Subject: mm/debug_vm_pgtable: drop RANDOM_ORVALUE trick Date: Thu, 23 May 2024 09:21:39 -0400 Macro RANDOM_ORVALUE was used to make sure the pgtable entry will be populated with !none data in clear tests. The RANDOM_ORVALUE tried to cover mostly all the bits in a pgtable entry, even if there's no discussion on whether all the bits will be vaild. Both S390 and PPC64 have their own masks to avoid touching some bits. Now it's the turn for x86_64. The issue is there's a recent report from Mikhail Gavrilov showing that this can cause a warning with the newly added pte set check in commit 8430557fc5 on writable v.s. userfaultfd-wp bit, even though the check itself was valid, the random pte is not. We can choose to mask more bits out. However the need to have such random bits setup is questionable, as now it's already guaranteed to be true on below: - For pte level, the pgtable entry will be installed with value from pfn_pte(), where pfn points to a valid page. Hence the pte will be !none already if populated with pfn_pte(). - For upper-than-pte level, the pgtable entry should contain a directory entry always, which is also !none. All the cases look like good enough to test a pxx_clear() helper. Instead of extending the bitmask, drop the "set random bits" trick completely. Add some warning guards to make sure the entries will be !none before clear(). Link: https://lkml.kernel.org/r/20240523132139.289719-1-peterx@xxxxxxxxxx Signed-off-by: Peter Xu <peterx@xxxxxxxxxx> Reported-by: Mikhail Gavrilov <mikhail.v.gavrilov@xxxxxxxxx> Link: https://lore.kernel.org/r/CABXGCsMB9A8-X+Np_Q+fWLURYL_0t3Y-MdoNabDM-Lzk58-DGA@xxxxxxxxxxxxxx Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@xxxxxxxxx> Cc: David Hildenbrand <david@xxxxxxxxxx> Cc: Pavel Tatashin <pasha.tatashin@xxxxxxxxxx> Cc: Aneesh Kumar K.V <aneesh.kumar@xxxxxxxxxxxxx> Cc: Gavin Shan <gshan@xxxxxxxxxx> Cc: Anshuman Khandual <anshuman.khandual@xxxxxxx> Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> --- mm/debug_vm_pgtable.c | 31 +++++-------------------------- 1 file changed, 5 insertions(+), 26 deletions(-) --- a/mm/debug_vm_pgtable.c~mm-debug_vm_pgtable-drop-random_orvalue-trick +++ a/mm/debug_vm_pgtable.c @@ -40,22 +40,7 @@ * Please refer Documentation/mm/arch_pgtable_helpers.rst for the semantics * expectations that are being validated here. All future changes in here * or the documentation need to be in sync. - * - * On s390 platform, the lower 4 bits are used to identify given page table - * entry type. But these bits might affect the ability to clear entries with - * pxx_clear() because of how dynamic page table folding works on s390. So - * while loading up the entries do not change the lower 4 bits. It does not - * have affect any other platform. Also avoid the 62nd bit on ppc64 that is - * used to mark a pte entry. */ -#define S390_SKIP_MASK GENMASK(3, 0) -#if __BITS_PER_LONG == 64 -#define PPC64_SKIP_MASK GENMASK(62, 62) -#else -#define PPC64_SKIP_MASK 0x0 -#endif -#define ARCH_SKIP_MASK (S390_SKIP_MASK | PPC64_SKIP_MASK) -#define RANDOM_ORVALUE (GENMASK(BITS_PER_LONG - 1, 0) & ~ARCH_SKIP_MASK) #define RANDOM_NZVALUE GENMASK(7, 0) struct pgtable_debug_args { @@ -511,8 +496,7 @@ static void __init pud_clear_tests(struc return; pr_debug("Validating PUD clear\n"); - pud = __pud(pud_val(pud) | RANDOM_ORVALUE); - WRITE_ONCE(*args->pudp, pud); + WARN_ON(pud_none(pud)); pud_clear(args->pudp); pud = READ_ONCE(*args->pudp); WARN_ON(!pud_none(pud)); @@ -548,8 +532,7 @@ static void __init p4d_clear_tests(struc return; pr_debug("Validating P4D clear\n"); - p4d = __p4d(p4d_val(p4d) | RANDOM_ORVALUE); - WRITE_ONCE(*args->p4dp, p4d); + WARN_ON(p4d_none(p4d)); p4d_clear(args->p4dp); p4d = READ_ONCE(*args->p4dp); WARN_ON(!p4d_none(p4d)); @@ -582,8 +565,7 @@ static void __init pgd_clear_tests(struc return; pr_debug("Validating PGD clear\n"); - pgd = __pgd(pgd_val(pgd) | RANDOM_ORVALUE); - WRITE_ONCE(*args->pgdp, pgd); + WARN_ON(pgd_none(pgd)); pgd_clear(args->pgdp); pgd = READ_ONCE(*args->pgdp); WARN_ON(!pgd_none(pgd)); @@ -634,10 +616,8 @@ static void __init pte_clear_tests(struc if (WARN_ON(!args->ptep)) return; -#ifndef CONFIG_RISCV - pte = __pte(pte_val(pte) | RANDOM_ORVALUE); -#endif set_pte_at(args->mm, args->vaddr, args->ptep, pte); + WARN_ON(pte_none(pte)); flush_dcache_page(page); barrier(); ptep_clear(args->mm, args->vaddr, args->ptep); @@ -650,8 +630,7 @@ static void __init pmd_clear_tests(struc pmd_t pmd = READ_ONCE(*args->pmdp); pr_debug("Validating PMD clear\n"); - pmd = __pmd(pmd_val(pmd) | RANDOM_ORVALUE); - WRITE_ONCE(*args->pmdp, pmd); + WARN_ON(pmd_none(pmd)); pmd_clear(args->pmdp); pmd = READ_ONCE(*args->pmdp); WARN_ON(!pmd_none(pmd)); _ Patches currently in -mm which might be from peterx@xxxxxxxxxx are mm-debug_vm_pgtable-drop-random_orvalue-trick.patch