The patch titled
     Subject: mm/vmalloc: fix HUGE_VMAP regression by enabling huge pages in vmalloc_to_page
has been added to the -mm tree.  Its filename is
     mm-vmalloc-fix-huge_vmap-regression-by-enabling-huge-pages-in-vmalloc_to_page.patch

This patch should soon appear at
    https://ozlabs.org/~akpm/mmots/broken-out/mm-vmalloc-fix-huge_vmap-regression-by-enabling-huge-pages-in-vmalloc_to_page.patch
and later at
    https://ozlabs.org/~akpm/mmotm/broken-out/mm-vmalloc-fix-huge_vmap-regression-by-enabling-huge-pages-in-vmalloc_to_page.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Nicholas Piggin <npiggin@xxxxxxxxx>
Subject: mm/vmalloc: fix HUGE_VMAP regression by enabling huge pages in vmalloc_to_page

vmalloc_to_page returns NULL for addresses mapped by larger pages[*].
Whether or not a vmap is huge depends on the architecture details,
alignments, boot options, etc., which the caller can not be expected to
know.  Therefore HUGE_VMAP is a regression for vmalloc_to_page.

This change teaches vmalloc_to_page about larger pages, and returns the
struct page that corresponds to the offset within the large page.  This
makes the API agnostic to mapping implementation details.

[*] As explained by commit 029c54b095995 ("mm/vmalloc.c: huge-vmap:
fail gracefully on unexpected huge vmap mappings")

Link: https://lkml.kernel.org/r/20210317062402.533919-3-npiggin@xxxxxxxxx
Signed-off-by: Nicholas Piggin <npiggin@xxxxxxxxx>
Reviewed-by: Miaohe Lin <linmiaohe@xxxxxxxxxx>
Reviewed-by: Christoph Hellwig <hch@xxxxxx>
Cc: Borislav Petkov <bp@xxxxxxxxx>
Cc: Catalin Marinas <catalin.marinas@xxxxxxx>
Cc: Ding Tianhong <dingtianhong@xxxxxxxxxx>
Cc: "H. Peter Anvin" <hpa@xxxxxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: Michael Ellerman <mpe@xxxxxxxxxxxxxx>
Cc: Russell King <linux@xxxxxxxxxxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: Uladzislau Rezki (Sony) <urezki@xxxxxxxxx>
Cc: Will Deacon <will@xxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 mm/vmalloc.c |   41 ++++++++++++++++++++++++++---------------
 1 file changed, 26 insertions(+), 15 deletions(-)

--- a/mm/vmalloc.c~mm-vmalloc-fix-huge_vmap-regression-by-enabling-huge-pages-in-vmalloc_to_page
+++ a/mm/vmalloc.c
@@ -34,7 +34,7 @@
 #include <linux/bitops.h>
 #include <linux/rbtree_augmented.h>
 #include <linux/overflow.h>
-
+#include <linux/pgtable.h>
 #include <linux/uaccess.h>
 #include <asm/tlbflush.h>
 #include <asm/shmparam.h>
@@ -343,7 +343,9 @@ int is_vmalloc_or_module_addr(const void
 }
 
 /*
- * Walk a vmap address to the struct page it maps.
+ * Walk a vmap address to the struct page it maps. Huge vmap mappings will
+ * return the tail page that corresponds to the base page address, which
+ * matches small vmap mappings.
  */
 struct page *vmalloc_to_page(const void *vmalloc_addr)
 {
@@ -363,25 +365,33 @@ struct page *vmalloc_to_page(const void
 
        if (pgd_none(*pgd))
                return NULL;
+       if (WARN_ON_ONCE(pgd_leaf(*pgd)))
+               return NULL; /* XXX: no allowance for huge pgd */
+       if (WARN_ON_ONCE(pgd_bad(*pgd)))
+               return NULL;
+
        p4d = p4d_offset(pgd, addr);
        if (p4d_none(*p4d))
                return NULL;
-       pud = pud_offset(p4d, addr);
+       if (p4d_leaf(*p4d))
+               return p4d_page(*p4d) + ((addr & ~P4D_MASK) >> PAGE_SHIFT);
+       if (WARN_ON_ONCE(p4d_bad(*p4d)))
+               return NULL;
 
-       /*
-        * Don't dereference bad PUD or PMD (below) entries. This will also
-        * identify huge mappings, which we may encounter on architectures
-        * that define CONFIG_HAVE_ARCH_HUGE_VMAP=y. Such regions will be
-        * identified as vmalloc addresses by is_vmalloc_addr(), but are
-        * not [unambiguously] associated with a struct page, so there is
-        * no correct value to return for them.
-        */
-       WARN_ON_ONCE(pud_bad(*pud));
-       if (pud_none(*pud) || pud_bad(*pud))
+       pud = pud_offset(p4d, addr);
+       if (pud_none(*pud))
+               return NULL;
+       if (pud_leaf(*pud))
+               return pud_page(*pud) + ((addr & ~PUD_MASK) >> PAGE_SHIFT);
+       if (WARN_ON_ONCE(pud_bad(*pud)))
                return NULL;
+
        pmd = pmd_offset(pud, addr);
-       WARN_ON_ONCE(pmd_bad(*pmd));
-       if (pmd_none(*pmd) || pmd_bad(*pmd))
+       if (pmd_none(*pmd))
+               return NULL;
+       if (pmd_leaf(*pmd))
+               return pmd_page(*pmd) + ((addr & ~PMD_MASK) >> PAGE_SHIFT);
+       if (WARN_ON_ONCE(pmd_bad(*pmd)))
                return NULL;
 
        ptep = pte_offset_map(pmd, addr);
@@ -389,6 +399,7 @@ struct page *vmalloc_to_page(const void
        if (pte_present(pte))
                page = pte_page(pte);
        pte_unmap(ptep);
+
        return page;
 }
 EXPORT_SYMBOL(vmalloc_to_page);
_

Patches currently in -mm which might be from npiggin@xxxxxxxxx are

arm-mm-add-missing-pud_page-define-to-2-level-page-tables.patch
mm-vmalloc-fix-huge_vmap-regression-by-enabling-huge-pages-in-vmalloc_to_page.patch
mm-apply_to_pte_range-warn-and-fail-if-a-large-pte-is-encountered.patch
mm-vmalloc-rename-vmap__range-vmap_pages__range.patch
mm-ioremap-rename-ioremap__range-to-vmap__range.patch
mm-huge_vmap-arch-support-cleanup.patch
powerpc-inline-huge-vmap-supported-functions.patch
arm64-inline-huge-vmap-supported-functions.patch
x86-inline-huge-vmap-supported-functions.patch
mm-vmalloc-provide-fallback-arch-huge-vmap-support-functions.patch
mm-move-vmap_range-from-mm-ioremapc-to-mm-vmallocc.patch
mm-vmalloc-add-vmap_range_noflush-variant.patch
mm-vmalloc-hugepage-vmalloc-mappings.patch
powerpc-64s-radix-enable-huge-vmalloc-mappings.patch
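A note on the offset arithmetic in the leaf cases of the patch above,
e.g. pmd_page(*pmd) + ((addr & ~PMD_MASK) >> PAGE_SHIFT): the address
bits below the leaf's mask select which base page within the huge
mapping covers addr, so the walk returns that tail struct page rather
than NULL.  Below is a minimal standalone userspace sketch of just that
computation, not kernel code: the shift values are assumed x86-64
defaults (4K base pages, 2MB PMD leaves), and tail_page_index is a
hypothetical helper invented for illustration, not a kernel API.

        #include <stdio.h>

        #define PAGE_SHIFT      12UL
        #define PMD_SHIFT       21UL                    /* 2MB huge mapping */
        #define PMD_MASK        (~((1UL << PMD_SHIFT) - 1))

        /* Index of the base page within a PMD-sized leaf that covers addr. */
        static unsigned long tail_page_index(unsigned long addr)
        {
                return (addr & ~PMD_MASK) >> PAGE_SHIFT;
        }

        int main(void)
        {
                /* An arbitrary address 5 base pages into a 2MB-aligned leaf. */
                unsigned long addr = 0xffffc90000205000UL;

                /*
                 * 0x205000 & 0x1fffff = 0x5000, and 0x5000 >> 12 = 5, so
                 * vmalloc_to_page() would return pmd_page(*pmd) + 5 here.
                 */
                printf("tail page index: %lu\n", tail_page_index(addr));
                return 0;
        }

The same pattern repeats one level up at PUD (1GB leaves) and P4D scope
with the corresponding mask, which is why the patch can handle each page
table level with the same three-line leaf check.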