On Thu, May 30, 2013 at 10:34:53AM +0530, Vineet Gupta wrote:
> On 05/29/2013 11:21 PM, Peter Zijlstra wrote:
> > What about something like this?
> >
> > ---
> >  include/asm-generic/tlb.h | 11 ++++++++++-
> >  mm/memory.c               | 17 ++++++++++++++++-
> >  2 files changed, 26 insertions(+), 2 deletions(-)
> >
> > diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h
> > index b1b1fa6..651b1cf 100644
> > --- a/include/asm-generic/tlb.h
> > +++ b/include/asm-generic/tlb.h
> > @@ -116,6 +116,7 @@ struct mmu_gather {
> >  
> >  static inline int tlb_fast_mode(struct mmu_gather *tlb)
> >  {
> > +#ifndef CONFIG_PREEMPT
> >  #ifdef CONFIG_SMP
> >          return tlb->fast_mode;
> >  #else
> > @@ -124,7 +125,15 @@ static inline int tlb_fast_mode(struct mmu_gather *tlb)
> >           * and page free order so much..
> >           */
> >          return 1;
> > -#endif
> > +#endif /* CONFIG_SMP */
> > +#else /* CONFIG_PREEMPT */
> > +        /*
> > +         * Since mmu_gather is preemptible, preemptible kernels are like SMP
> > +         * kernels, we must batch to make sure we invalidate TLBs before we
> > +         * free the pages.
> > +         */
> > +        return 0;
> > +#endif /* CONFIG_PREEMPT */
> >  }
>
> So this adds the page batching logic to small/simple UP systems - but it's a
> necessary evil :-(

Well, only if you have CONFIG_PREEMPT=y, at which point UP isn't really _that_
simple anymore.

> >  void tlb_gather_mmu(struct mmu_gather *tlb, struct mm_struct *mm, bool fullmm);
> > diff --git a/mm/memory.c b/mm/memory.c
> > index 6dc1882..e915af2 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -384,6 +384,21 @@ void tlb_remove_table(struct mmu_gather *tlb, void *table)
> >  
> >  #endif /* CONFIG_HAVE_RCU_TABLE_FREE */
> >  
> > +static inline void cond_resched_tlb(struct mmu_gather *tlb)
> > +{
> > +#ifndef CONFIG_PREEMPT
> > +        /*
> > +         * For full preempt kernels we must do regular batching like
> > +         * SMP, see tlb_fast_mode(). For !PREEMPT we can 'cheat' and
> > +         * do a flush before our voluntary 'yield'.
> > +         */
> > +        if (need_resched()) {
>
> This is really neat: w/o this check, a @fullmm flush (exit/execve) would have
> suffered multiple full TLB flushes in the loop; now you only do that when
> rescheduling is needed - i.e. only in the case where we have the potential race
> condition which Max was seeing. Cool!

Glad you like it ;-)

> > +                tlb_flush_mmu(tlb);
> > +                cond_resched();
> > +        }
> > +#endif
> > +}
> > +
> >  /*
> >   * If a p?d_bad entry is found while walking page tables, report
> >   * the error, before resetting entry to p?d_none. Usually (but
> > @@ -1264,7 +1279,7 @@ static inline unsigned long zap_pmd_range(struct mmu_gather *tlb,
> >                          goto next;
> >                  next = zap_pte_range(tlb, vma, pmd, addr, next, details);
> >  next:
> > -                cond_resched();
> > +                cond_resched_tlb(tlb);
> >          } while (pmd++, addr = next, addr != end);
> >  
> >          return addr;
>
> BTW, since we are on the topic, it seems that we are missing tlb_fast_mode() in
> one spot - unless it is tied to the rcu table free stuff.
>
> -------------->
> From: Vineet Gupta <vgupta@xxxxxxxxxxxx>
> Date: Thu, 30 May 2013 10:25:30 +0530
> Subject: [PATCH] mm: tlb_fast_mode check missing in tlb_finish_mmu()
>
> Signed-off-by: Vineet Gupta <vgupta@xxxxxxxxxxxx>
> ---
>  mm/memory.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/mm/memory.c b/mm/memory.c
> index d9d5fd9..569ffe1 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -269,6 +269,9 @@ void tlb_finish_mmu(struct mmu_gather *tlb, unsigned long start, unsigned long end)
>          /* keep the page table cache within bounds */
>          check_pgt_cache();
>  
> +        if (tlb_fast_mode(tlb))
> +                return;
> +
>          for (batch = tlb->local.next; batch; batch = next) {
>                  next = batch->next;
>                  free_pages((unsigned long)batch, 0);

Yes, I think that is possible. It would shrink the code a little when fast_mode
is unconditionally 1 -- i.e. simple UP ;-).
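To make that concrete, here is a small userspace sketch of why the check should
only shrink the code rather than change behaviour: in fast mode nothing ever
gets queued on tlb->local.next, so the free loop in tlb_finish_mmu() is already
a no-op, and once fast mode is a compile-time constant the early return lets
the compiler drop the loop entirely. The types and names below are invented
stand-ins for the example, not the kernel's own.

/*
 * Illustration only -- invented names, not kernel code.  In fast mode no
 * batch pages are ever queued on tlb->local.next, so the free loop below
 * already does nothing; the early return merely lets a build where fast
 * mode is a compile-time constant drop the loop.
 */
#include <stdio.h>
#include <stdlib.h>

struct batch_sketch {
        struct batch_sketch *next;
};

struct tlb_sketch {
        int fast_mode;                  /* 1 on a simple !SMP, !PREEMPT build */
        struct batch_sketch local;      /* local.next stays NULL in fast mode */
};

static void finish_mmu_sketch(struct tlb_sketch *tlb)
{
        struct batch_sketch *batch, *next;

        if (tlb->fast_mode)
                return;                 /* the proposed early exit */

        for (batch = tlb->local.next; batch; batch = next) {
                next = batch->next;
                free(batch);            /* stands in for free_pages() */
        }
}

int main(void)
{
        struct tlb_sketch tlb = { .fast_mode = 1, .local = { .next = NULL } };

        finish_mmu_sketch(&tlb);
        puts("fast mode: loop skipped (it was empty anyway)");
        return 0;
}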
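And, going back to the main patch in this thread, a similar sketch of the
ordering cond_resched_tlb() is built around on !PREEMPT kernels: bring the TLB
up to date before the voluntary yield, so whatever runs next can never be
handed a page that a stale TLB entry still maps. Again, every name here
(mmu_gather_sketch, need_resched_stub, and so on) is made up for the
illustration; none of it is the real kernel API.

/*
 * Illustration only -- invented names, not kernel code.  The point of
 * cond_resched_tlb() on !PREEMPT kernels: finish the pending TLB work
 * *before* the voluntary yield, so the task that runs next can never be
 * given a page that a stale TLB entry still maps.
 */
#include <stdbool.h>
#include <stdio.h>

struct mmu_gather_sketch {
        int nr_pending;                 /* unmapped pages not yet reconciled */
};

static void flush_tlb_and_settle(struct mmu_gather_sketch *tlb)
{
        printf("flush TLB, settle %d pending pages\n", tlb->nr_pending);
        tlb->nr_pending = 0;
}

static bool need_resched_stub(void)
{
        return true;                    /* pretend another task is waiting */
}

static void cond_resched_tlb_sketch(struct mmu_gather_sketch *tlb)
{
        if (need_resched_stub()) {
                flush_tlb_and_settle(tlb);      /* nothing stale remains... */
                printf("yield the CPU\n");      /* ...when someone else runs */
        }
}

int main(void)
{
        struct mmu_gather_sketch tlb = { .nr_pending = 8 };

        cond_resched_tlb_sketch(&tlb);
        return 0;
}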