On Tue, Feb 19, 2019 at 11:31:53AM +0100, Peter Zijlstra wrote:
> When an architecture does not have (an efficient) flush_tlb_range(),
> but instead always uses full TLB invalidates, the current generic
> tlb_flush() is sub-optimal, for it will generate extra flushes in
> order to keep the range small.
> 
> But if we cannot do range flushes, that is a moot concern. Optionally
> provide this simplified default.
> 
> Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
> ---
>  include/asm-generic/tlb.h | 41 ++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 40 insertions(+), 1 deletion(-)
> 
> --- a/include/asm-generic/tlb.h
> +++ b/include/asm-generic/tlb.h
> @@ -114,7 +114,8 @@
>   * returns the smallest TLB entry size unmapped in this range.
>   *
>   * If an architecture does not provide tlb_flush() a default implementation
> - * based on flush_tlb_range() will be used.
> + * based on flush_tlb_range() will be used, unless MMU_GATHER_NO_RANGE is
> + * specified, in which case we'll default to flush_tlb_mm().
>   *
>   * Additionally there are a few opt-in features:
>   *
> @@ -140,6 +141,9 @@
>   * the page-table pages. Required if you use HAVE_RCU_TABLE_FREE and your
>   * architecture uses the Linux page-tables natively.
>   *
> + * MMU_GATHER_NO_RANGE
> + *
> + * Use this if your architecture lacks an efficient flush_tlb_range().
>   */
>  #define HAVE_GENERIC_MMU_GATHER
>  
> @@ -302,12 +306,45 @@ static inline void __tlb_reset_range(str
>  	 */
>  }
>  
> +#ifdef CONFIG_MMU_GATHER_NO_RANGE
> +
> +#if defined(tlb_flush) || defined(tlb_start_vma) || defined(tlb_end_vma)
> +#error MMU_GATHER_NO_RANGE relies on default tlb_flush(), tlb_start_vma() and tlb_end_vma()
> +#endif
> +
> +/*
> + * When an architecture does not have efficient means of range flushing TLBs
> + * there is no point in doing intermediate flushes on tlb_end_vma() to keep the
> + * range small. We equally don't have to worry about page granularity or other
> + * things.
> + *
> + * All we need to do is issue a full flush for any !0 range.
> + */
> +static inline void tlb_flush(struct mmu_gather *tlb)
> +{
> +	if (tlb->end)
> +		flush_tlb_mm(tlb->mm);
> +}

I guess another way we could handle these architectures is by unconditionally
resetting tlb->fullmm to 1, but this works too.

Acked-by: Will Deacon <will.deacon@xxxxxxx>

Will
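
For reference, a rough sketch of the fullmm alternative mentioned above; the
hook name and its placement are assumed for illustration only and are not part
of the posted patch:

	/*
	 * Sketch only: if an architecture without efficient range flushing
	 * marked every gather as fullmm up front, the existing default
	 * tlb_flush() would already take the flush_tlb_mm() path, so no
	 * MMU_GATHER_NO_RANGE-specific tlb_flush() would be needed.
	 * (Hypothetical helper; the real gather setup lives in mm/.)
	 */
	static inline void no_range_gather_init(struct mmu_gather *tlb,
						struct mm_struct *mm)
	{
		tlb->mm     = mm;
		tlb->fullmm = 1;	/* treat every unmap as a full-mm teardown */
		__tlb_reset_range(tlb);
	}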