Re: [PATCH 08/20] mm: Optimize fullmm TLB flushing

On Thu, 2012-06-28 at 15:53 +0100, Catalin Marinas wrote:

> > Yes they do... it's just the up-front TLB invalidate for fullmm that's a
> > problem.
> 
> The up-front invalidate is fine (i.e. harmless); it's the tlb_flush_mmu()
> change to check for !tlb->fullmm that's not helpful on ARM.

I think we're saying the same thing, just differently. The point is that
the up-front flush isn't sufficient for most of us.

Also, we'd very much like to avoid superfluous flushes, since they are
somewhat expensive.

How horrid is something like the below? It detaches the mm so that
hardware speculation simply doesn't matter.

Now the switch_mm() should imply the same cache+TLB flush we'd otherwise
do, and I'd think that would be the majority of the cost. Am I wrong
there?

Also, the below seems to leak mm_structs, so I must have messed up the
ref-counting; it's too bloody hot here.



---
 mm/memory.c |   51 +++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 47 insertions(+), 4 deletions(-)
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -65,6 +65,7 @@
 #include <asm/tlb.h>
 #include <asm/tlbflush.h>
 #include <asm/pgtable.h>
+#include <asm/mmu_context.h>
 
 #include "internal.h"
 
@@ -197,6 +198,33 @@ static int tlb_next_batch(struct mmu_gat
 	return 1;
 }
 
+/*
+ * Anonymize the task by detaching the mm and attaching it
+ * to the init_mm.
+ */
+static void detach_mm(struct mm_struct *mm, struct task_struct *tsk)
+{
+	/*
+	 * We should only be called when there are no users left and we're
+	 * destroying the mm.
+	 */
+	VM_BUG_ON(atomic_read(&mm->mm_users));
+	VM_BUG_ON(tsk->mm != mm);
+	VM_BUG_ON(mm == &init_mm);
+
+	task_lock(tsk);
+	tsk->mm = NULL;
+	tsk->active_mm = &init_mm;
+	switch_mm(mm, &init_mm, tsk);
+	/*
+	 * We have to take an extra ref on init_mm for TASK_DEAD in
+	 * finish_task_switch(), we don't drop our mm->mm_count reference
+	 * since mmput() will do this.
+	 */
+	atomic_inc(&init_mm.mm_count);
+	task_unlock(tsk);
+}
+
 /* tlb_gather_mmu
  *	Called to initialize an (on-stack) mmu_gather structure for page-table
  *	tear-down from @mm. The @fullmm argument is used when @mm is without
@@ -215,16 +243,31 @@ void tlb_gather_mmu(struct mmu_gather *t
 	tlb->active     = &tlb->local;
 
 	tlb_table_init(tlb);
+
+	if (fullmm && current->mm == mm) {
+		/*
+		 * Instead of doing:
+		 *
+		 *  flush_cache_mm(mm);
+		 *  flush_tlb_mm(mm);
+		 *
+		 * We switch to init_mm, this context switch should imply both
+		 * the cache and TLB flush as well as guarantee that hardware
+		 * speculation cannot load TLBs on this mm anymore.
+		 */
+		detach_mm(mm, current);
+	}
 }
 
 void tlb_flush_mmu(struct mmu_gather *tlb)
 {
 	struct mmu_gather_batch *batch;
 
-	if (!tlb->need_flush)
-		return;
-	tlb->need_flush = 0;
-	flush_tlb_mm(tlb->mm);
+	if (!tlb->fullmm && tlb->need_flush) {
+		tlb->need_flush = 0;
+		flush_tlb_mm(tlb->mm);
+	}
+
 	tlb_table_flush(tlb);
 
 	if (tlb_fast_mode(tlb))
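
For context, the fullmm path above is the one entered from exit_mmap(),
where current->mm == mm holds for the exiting task. A rough sketch of
the caller side (simplified from mm/mmap.c of roughly that era, from
memory; accounting, mmu notifiers and error handling elided, and exact
signatures illustrative rather than authoritative):

	void exit_mmap(struct mm_struct *mm)
	{
		struct mmu_gather tlb;

		/* ... mmu_notifier_release(), locked_vm handling elided ... */

		/*
		 * fullmm == 1: with the change above, tlb_gather_mmu()
		 * detaches current from @mm and switches to init_mm, so
		 * hardware speculation can no longer repopulate TLB
		 * entries for @mm while its page tables are torn down.
		 */
		tlb_gather_mmu(&tlb, mm, 1);
		unmap_vmas(&tlb, mm->mmap, 0, -1);
		free_pgtables(&tlb, mm->mmap, FIRST_USER_ADDRESS, 0);
		tlb_finish_mmu(&tlb, 0, -1);

		/* ... vma freeing elided ... */
	}

That way the single switch_mm() done at tlb_gather_mmu() time stands in
for the flush_cache_mm()+flush_tlb_mm() pair, and the !tlb->fullmm test
in tlb_flush_mmu() avoids the per-batch flush_tlb_mm() calls for the
dying mm.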


