On 2/6/25 18:34, Ryan Roberts wrote:
> On 06/02/2025 06:46, Anshuman Khandual wrote:
>>
>> On 2/5/25 20:39, Ryan Roberts wrote:
>>> commit c910f2b65518 ("arm64/mm: Update tlb invalidation routines for
>>> FEAT_LPA2") changed the "invalidation level unknown" hint from 0 to
>>> TLBI_TTL_UNKNOWN (INT_MAX). But the fallback "unknown level" path in
>>> flush_hugetlb_tlb_range() was not updated. So as it stands, when trying
>>> to invalidate CONT_PMD_SIZE or CONT_PTE_SIZE hugetlb mappings, we will
>>> spuriously try to invalidate at level 0 on LPA2-enabled systems.
>>>
>>> Fix this so that the fallback passes TLBI_TTL_UNKNOWN, and while we are
>>> at it, explicitly use the correct stride and level for CONT_PMD_SIZE and
>>> CONT_PTE_SIZE, which should provide a minor optimization.
>>>
>>> Cc: <stable@xxxxxxxxxxxxxxx>
>>> Fixes: c910f2b65518 ("arm64/mm: Update tlb invalidation routines for FEAT_LPA2")
>>> Signed-off-by: Ryan Roberts <ryan.roberts@xxxxxxx>
>>> ---
>>>  arch/arm64/include/asm/hugetlb.h | 20 ++++++++++++++------
>>>  1 file changed, 14 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/arch/arm64/include/asm/hugetlb.h b/arch/arm64/include/asm/hugetlb.h
>>> index 03db9cb21ace..8ab9542d2d22 100644
>>> --- a/arch/arm64/include/asm/hugetlb.h
>>> +++ b/arch/arm64/include/asm/hugetlb.h
>>> @@ -76,12 +76,20 @@ static inline void flush_hugetlb_tlb_range(struct vm_area_struct *vma,
>>>  {
>>>  	unsigned long stride = huge_page_size(hstate_vma(vma));
>>>
>>> -	if (stride == PMD_SIZE)
>>> -		__flush_tlb_range(vma, start, end, stride, false, 2);
>>> -	else if (stride == PUD_SIZE)
>>> -		__flush_tlb_range(vma, start, end, stride, false, 1);
>>> -	else
>>> -		__flush_tlb_range(vma, start, end, PAGE_SIZE, false, 0);
>>> +	switch (stride) {
>>> +	case PUD_SIZE:
>>> +		__flush_tlb_range(vma, start, end, PUD_SIZE, false, 1);
>>> +		break;
>>
>> Just wondering - should not !__PAGETABLE_PMD_FOLDED and pud_sect_supported()
>> checks also be added here for this PUD_SIZE case ?
> Yeah I guess so. TBH, it's never been entirely clear to me what the benefit is?
> Is it just to remove (a tiny amount of) dead code when we know we don't support
> blocks at the level? Or is there something more fundamental going on that I've
> missed?

There is a generic fallback for PUD_SIZE in include/asm-generic/pgtable-nopud.h
when it is not defined on the arm64 platform, and pud_sect_supported() might
also get optimized away by the compiler:

static inline bool pud_sect_supported(void)
{
	return PAGE_SIZE == SZ_4K;
}

IIUC this just saves dead code from being compiled, as you mentioned.

> We seem to be quite inconsistent with the use of pud_sect_supported() in
> hugetlbpage.c.

The PUD_SIZE switch cases in hugetlb_mask_last_page() and arch_make_huge_pte()?
Those should be fixed.

> Anyway, I'll add this in, I guess it's preferable to follow the established pattern.

Agreed.

> Thanks,
> Ryan

>>
>>> +	case CONT_PMD_SIZE:
>>> +	case PMD_SIZE:
>>> +		__flush_tlb_range(vma, start, end, PMD_SIZE, false, 2);
>>> +		break;
>>> +	case CONT_PTE_SIZE:
>>> +		__flush_tlb_range(vma, start, end, PAGE_SIZE, false, 3);
>>> +		break;
>>> +	default:
>>> +		__flush_tlb_range(vma, start, end, PAGE_SIZE, false, TLBI_TTL_UNKNOWN);
>>> +	}
>>>  }
>>>
>>>  #endif /* __ASM_HUGETLB_H */