Hi Peter, On Thu, Sep 13, 2018 at 11:21:17AM +0200, Peter Zijlstra wrote: > Generic mmu_gather provides everything that ARM needs: > > - range tracking > - RCU table free > - VM_EXEC tracking > - VIPT cache flushing > > The one notable curiosity is the 'funny' range tracking for classical > ARM in __pte_free_tlb(). > > Cc: Will Deacon <will.deacon@xxxxxxx> > Cc: "Aneesh Kumar K.V" <aneesh.kumar@xxxxxxxxxxxxxxxxxx> > Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> > Cc: Nick Piggin <npiggin@xxxxxxxxx> > Cc: Russell King <linux@xxxxxxxxxxxxxxx> > Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx> > --- > arch/arm/include/asm/tlb.h | 255 ++------------------------------------------- > 1 file changed, 14 insertions(+), 241 deletions(-) So whilst I was reviewing this, I realised that I think we should be selecting HAVE_RCU_TABLE_INVALIDATE for arch/arm/ if HAVE_RCU_TABLE_FREE. Whilst we don't distinguish between invalidation of intermediate and leaf levels on 32-bit, the CPU is still permitted to cache partial translation table walks even if the leaf entry indicates a fault. That means that after tearing down the PTEs, we can still get walk cache allocations and so if the RCU batching of the page tables fails, we need to invalidate the TLB after clearing the intermediate entries but before freeing them. > -static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t pte, > - unsigned long addr) > +__pte_free_tlb(struct mmu_gather *tlb, pgtable_t pte, unsigned long addr) > { > pgtable_page_dtor(pte); > > -#ifdef CONFIG_ARM_LPAE > - tlb_add_flush(tlb, addr); > -#else > +#ifndef CONFIG_ARM_LPAE > /* > * With the classic ARM MMU, a pte page has two corresponding pmd > * entries, each covering 1MB. > */ > - addr &= PMD_MASK; > - tlb_add_flush(tlb, addr + SZ_1M - PAGE_SIZE); > - tlb_add_flush(tlb, addr + SZ_1M); > + addr = (addr & PMD_MASK) + SZ_1M; > + __tlb_adjust_range(tlb, addr - PAGE_SIZE, addr + PAGE_SIZE); Hmm, I don't think you've got the range correct here. Don't we want something like: __tlb_adjust_range(tlb, addr - PAGE_SIZE, 2 * PAGE_SIZE) to ensure that we flush on both sides of the 1M boundary? Will