On Thu, Dec 10, 2020 at 12:11:07PM +0000, Will Deacon wrote:
> Passing the range '0, -1' to tlb_gather_mmu() sets the 'fullmm' flag,
> which indicates that the mm_struct being operated on is going away. In
> this case, some architectures (such as arm64) can elide TLB invalidation
> by ensuring that the TLB tag (ASID) associated with this mm is not
> immediately reclaimed. Although this behaviour is documented in
> asm-generic/tlb.h, it's subtle and easily missed.
> 
> Introduce tlb_gather_mmu_fullmm() to make it clearer that this is for the
> entire mm and WARN() if tlb_gather_mmu() is called with the 'fullmm'
> address range.
> 
> Signed-off-by: Will Deacon <will@xxxxxxxxxx>
> ---
>  include/asm-generic/tlb.h |  6 ++++--
>  include/linux/mm_types.h  |  1 +
>  mm/mmap.c                 |  2 +-
>  mm/mmu_gather.c           | 16 ++++++++++++++--
>  4 files changed, 20 insertions(+), 5 deletions(-)
> 
> diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h
> index 6661ee1cff47..2c68a545ffa7 100644
> --- a/include/asm-generic/tlb.h
> +++ b/include/asm-generic/tlb.h
> @@ -46,7 +46,9 @@
>   *
>   * The mmu_gather API consists of:
>   *
> - *  - tlb_gather_mmu() / tlb_finish_mmu(); start and finish a mmu_gather
> + *  - tlb_gather_mmu() / tlb_gather_mmu_fullmm() / tlb_finish_mmu()
> + *
> + *    start and finish a mmu_gather
>   *
>   *    Finish in particular will issue a (final) TLB invalidate and free
>   *    all (remaining) queued pages.
> @@ -91,7 +93,7 @@
>   *
>   *  - mmu_gather::fullmm
>   *
> - *    A flag set by tlb_gather_mmu() to indicate we're going to free
> + *    A flag set by tlb_gather_mmu_fullmm() to indicate we're going to free
>   *    the entire mm; this allows a number of optimizations.
>   *
>   *    - We can ignore tlb_{start,end}_vma(); because we don't
> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index 7b90058a62be..42231729affe 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -585,6 +585,7 @@ static inline cpumask_t *mm_cpumask(struct mm_struct *mm)
>  struct mmu_gather;
>  extern void tlb_gather_mmu(struct mmu_gather *tlb, struct mm_struct *mm,
>  			   unsigned long start, unsigned long end);
> +extern void tlb_gather_mmu_fullmm(struct mmu_gather *tlb, struct mm_struct *mm);
>  extern void tlb_finish_mmu(struct mmu_gather *tlb);
>  
>  static inline void init_tlb_flush_pending(struct mm_struct *mm)
> diff --git a/mm/mmap.c b/mm/mmap.c
> index a3e5854cd01e..cdd3dae6547c 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -3214,7 +3214,7 @@ void exit_mmap(struct mm_struct *mm)
>  
>  	lru_add_drain();
>  	flush_cache_mm(mm);
> -	tlb_gather_mmu(&tlb, mm, 0, -1);
> +	tlb_gather_mmu_fullmm(&tlb, mm);
>  	/* update_hiwater_rss(mm) here? but nobody should be looking */
>  	/* Use -1 here to ensure all VMAs in the mm are unmapped */
>  	unmap_vmas(&tlb, vma, 0, -1);
> diff --git a/mm/mmu_gather.c b/mm/mmu_gather.c
> index b0be5a7aa08f..5f5e45d9eb50 100644
> --- a/mm/mmu_gather.c
> +++ b/mm/mmu_gather.c
> @@ -261,8 +261,8 @@ void tlb_flush_mmu(struct mmu_gather *tlb)
>   * respectively when @mm is without users and we're going to destroy
>   * the full address space (exit/execve).
>   */
> -void tlb_gather_mmu(struct mmu_gather *tlb, struct mm_struct *mm,
> -		    unsigned long start, unsigned long end)
> +static void __tlb_gather_mmu(struct mmu_gather *tlb, struct mm_struct *mm,
> +			     unsigned long start, unsigned long end)
>  {
>  	tlb->mm = mm;
>  
> @@ -287,6 +287,18 @@ void tlb_gather_mmu(struct mmu_gather *tlb, struct mm_struct *mm,
>  	inc_tlb_flush_pending(tlb->mm);
>  }
>  
> +void tlb_gather_mmu(struct mmu_gather *tlb, struct mm_struct *mm,
> +		    unsigned long start, unsigned long end)
> +{
> +	WARN_ON(!(start | (end + 1)));	/* Use _fullmm() instead */
> +	__tlb_gather_mmu(tlb, mm, start, end);
> +}
> +
> +void tlb_gather_mmu_fullmm(struct mmu_gather *tlb, struct mm_struct *mm)
> +{
> +	__tlb_gather_mmu(tlb, mm, 0, -1);
> +}
> +

IMO, there is no point in adding the wrappers, given that you will remove
the WARN_ON in the next patch. But if you prefer to keep them, they should
at least be moved to the header so they can be inlined; a sketch of what
that could look like follows below.

Consider this whole series

Reviewed-by: Yu Zhao <yuzhao@xxxxxxxxxx>
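
For illustration only, a minimal sketch of the header variant I have in
mind. The placement and the extern declaration of __tlb_gather_mmu() are
my assumption (it would have to lose its static in mm/mmu_gather.c); the
names otherwise follow the patch above:

/*
 * Hypothetical header placement; assumes __tlb_gather_mmu() is made
 * non-static in mm/mmu_gather.c.
 */
extern void __tlb_gather_mmu(struct mmu_gather *tlb, struct mm_struct *mm,
			     unsigned long start, unsigned long end);

static inline void tlb_gather_mmu(struct mmu_gather *tlb, struct mm_struct *mm,
				  unsigned long start, unsigned long end)
{
	/* (start | (end + 1)) == 0 only for the 0..-1 'fullmm' range */
	WARN_ON(!(start | (end + 1)));	/* Use _fullmm() instead */
	__tlb_gather_mmu(tlb, mm, start, end);
}

static inline void tlb_gather_mmu_fullmm(struct mmu_gather *tlb,
					 struct mm_struct *mm)
{
	__tlb_gather_mmu(tlb, mm, 0, -1);
}

Inlining avoids funneling every call site through an extra wrapper call;
the trade-off is exposing __tlb_gather_mmu() outside mm/mmu_gather.c.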