On Tue, Mar 04, 2025 at 12:51:27AM -0800, Piotr Jaroszynski wrote:
> Update the __flush_tlb_range_op macro not to modify its parameters as
> these are unexpected semantics. In practice, this fixes the call to
> mmu_notifier_arch_invalidate_secondary_tlbs() in
> __flush_tlb_range_nosync() to use the correct range instead of an empty
> range with start=end. The empty range was (un)lucky as it results in
> taking the invalidate-all path that doesn't cause correctness issues,
> but can certainly result in suboptimal perf.
>
> This has been broken since commit 6bbd42e2df8f ("mmu_notifiers: call
> invalidate_range() when invalidating TLBs") when the call to the
> notifiers was added to __flush_tlb_range(). It predates the addition of
> the __flush_tlb_range_op() macro from commit 360839027a6e ("arm64: tlb:
> Refactor the core flush algorithm of __flush_tlb_range") that made the
> bug hard to spot.

That's the problem with macros.

Reviewed-by: Catalin Marinas <catalin.marinas@xxxxxxx>

Will, do you want to take this as a fix? It's only a performance
regression, though you never know how it might break the callers of the
macro at some point.
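As an aside, here is a minimal stand-alone sketch of the failure mode,
not the kernel macro itself: flush_op, the addresses and the page-sized
stride below are made up for illustration.

#include <stdio.h>

/* Hypothetical stand-in for __flush_tlb_range_op(): it advances its
 * 'start' and 'pages' arguments in place, i.e. it consumes them. */
#define flush_op(start, pages, stride)                          \
do {                                                            \
        while ((pages) > 0) {                                   \
                /* pretend to invalidate one stride at start */ \
                (start) += (stride);                            \
                (pages) -= 1;                                   \
        }                                                       \
} while (0)

int main(void)
{
        unsigned long start = 0x1000, pages = 4, stride = 0x1000;
        unsigned long end = start + pages * stride;

        flush_op(start, pages, stride);

        /* Mirrors the mmu_notifier_arch_invalidate_secondary_tlbs()
         * call in __flush_tlb_range_nosync(): 'start' has been advanced
         * up to 'end', so the notified range [start, end) is empty. */
        printf("notify range: [%#lx, %#lx)\n", start, end);
        return 0;
}

With the patch's local __flush_start/__flush_pages copies, the caller's
variables survive the flush and the notifier sees the real range.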
> Fixes: 6bbd42e2df8f ("mmu_notifiers: call invalidate_range() when invalidating TLBs")
>
> Signed-off-by: Piotr Jaroszynski <pjaroszynski@xxxxxxxxxx>
> Cc: Catalin Marinas <catalin.marinas@xxxxxxx>
> Cc: Will Deacon <will@xxxxxxxxxx>
> Cc: Robin Murphy <robin.murphy@xxxxxxx>
> Cc: Alistair Popple <apopple@xxxxxxxxxx>
> Cc: Raghavendra Rao Ananta <rananta@xxxxxxxxxx>
> Cc: SeongJae Park <sj@xxxxxxxxxx>
> Cc: Jason Gunthorpe <jgg@xxxxxxxxxx>
> Cc: John Hubbard <jhubbard@xxxxxxxxxx>
> Cc: Nicolin Chen <nicolinc@xxxxxxxxxx>
> Cc: linux-arm-kernel@xxxxxxxxxxxxxxxxxxx
> Cc: iommu@xxxxxxxxxxxxxxx
> Cc: linux-mm@xxxxxxxxx
> Cc: linux-kernel@xxxxxxxxxxxxxxx
> Cc: stable@xxxxxxxxxxxxxxx
> ---
>  arch/arm64/include/asm/tlbflush.h | 22 ++++++++++++----------
>  1 file changed, 12 insertions(+), 10 deletions(-)
>
> diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
> index bc94e036a26b..8104aee4f9a0 100644
> --- a/arch/arm64/include/asm/tlbflush.h
> +++ b/arch/arm64/include/asm/tlbflush.h
> @@ -396,33 +396,35 @@ static inline void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
>  #define __flush_tlb_range_op(op, start, pages, stride, \
>                                  asid, tlb_level, tlbi_user, lpa2) \
>  do { \
> +        typeof(start) __flush_start = start; \
> +        typeof(pages) __flush_pages = pages; \
>         int num = 0; \
>         int scale = 3; \
>         int shift = lpa2 ? 16 : PAGE_SHIFT; \
>         unsigned long addr; \
>  \
> -        while (pages > 0) { \
> +        while (__flush_pages > 0) { \
>                 if (!system_supports_tlb_range() || \
> -                    pages == 1 || \
> -                    (lpa2 && start != ALIGN(start, SZ_64K))) { \
> -                        addr = __TLBI_VADDR(start, asid); \
> +                    __flush_pages == 1 || \
> +                    (lpa2 && __flush_start != ALIGN(__flush_start, SZ_64K))) { \
> +                        addr = __TLBI_VADDR(__flush_start, asid); \
>                         __tlbi_level(op, addr, tlb_level); \
>                         if (tlbi_user) \
>                                 __tlbi_user_level(op, addr, tlb_level); \
> -                        start += stride; \
> -                        pages -= stride >> PAGE_SHIFT; \
> +                        __flush_start += stride; \
> +                        __flush_pages -= stride >> PAGE_SHIFT; \
>                         continue; \
>                 } \
>  \
> -                num = __TLBI_RANGE_NUM(pages, scale); \
> +                num = __TLBI_RANGE_NUM(__flush_pages, scale); \
>                 if (num >= 0) { \
> -                        addr = __TLBI_VADDR_RANGE(start >> shift, asid, \
> +                        addr = __TLBI_VADDR_RANGE(__flush_start >> shift, asid, \
>                                                 scale, num, tlb_level); \
>                         __tlbi(r##op, addr); \
>                         if (tlbi_user) \
>                                 __tlbi_user(r##op, addr); \
> -                        start += __TLBI_RANGE_PAGES(num, scale) << PAGE_SHIFT; \
> -                        pages -= __TLBI_RANGE_PAGES(num, scale); \
> +                        __flush_start += __TLBI_RANGE_PAGES(num, scale) << PAGE_SHIFT; \
> +                        __flush_pages -= __TLBI_RANGE_PAGES(num, scale);\
>                 } \
>                 scale--; \
>         } \
>
> base-commit: 99fa936e8e4f117d62f229003c9799686f74cebc
> --
> 2.22.1.7.gac84d6e93c.dirty

--
Catalin