On Tue, Jul 28, 2020 at 12:00:09PM -0400, Mathieu Desnoyers wrote: > exit_mm should issue memory barriers after user-space memory accesses, > before clearing current->mm, to order user-space memory accesses > performed prior to exit_mm before clearing tsk->mm, which has the > effect of skipping the membarrier private expedited IPIs. > > The membarrier system call can be issued concurrently with do_exit > if we have thread groups created with CLONE_VM but not CLONE_THREAD. I'm still wonder what the exact failure case is though; exit_mm() is on the exit path (as the name very much implies) and the thread is about to die. The context switch that follows guarantees a full barrier before we run anything else again. > The following comment in exit_mm() seems rather puzzling though: > > /* more a memory barrier than a real lock */ > task_lock(current); > > So considering that spinlocks are not full memory barriers nowadays, > some digging into the origins of this comment led me to 2002, at the > earliest commit tracked by Thomas Gleixner's bitkeeper era's history > at https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git/ > > At that point, this comment was followed by: > > + /* more a memory barrier than a real lock */ > + task_lock(tsk); > + tsk->mm = NULL; > + task_unlock(tsk); > > Which seems to indicate that grabbing the lock is really about acting as > a memory barrier, but I wonder if it is meant to be a full barrier or > just an acquire. > > If a full memory barrier happens to be needed even without membarrier, > perhaps this fix also corrects other issues ? It unclear from the > comment what this memory barrier orders though. Is it the chain > exit -> waitpid, or is it related to entering lazy TLB ? I'm as puzzled by that comment as are you. It does seem to be required for glorious stuff like get_task_mm() though. > Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx> > Cc: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx> > Cc: Will Deacon <will@xxxxxxxxxx> > Cc: Paul E. McKenney <paulmck@xxxxxxxxxx> > Cc: Nicholas Piggin <npiggin@xxxxxxxxx> > Cc: Andy Lutomirski <luto@xxxxxxxxxxxxxx> > Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx> > Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> > Cc: Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> > Cc: linux-mm@xxxxxxxxx > --- > kernel/exit.c | 8 ++++++++ > 1 file changed, 8 insertions(+) > > diff --git a/kernel/exit.c b/kernel/exit.c > index 727150f28103..ce272ec55cdc 100644 > --- a/kernel/exit.c > +++ b/kernel/exit.c > @@ -474,6 +474,14 @@ static void exit_mm(void) > BUG_ON(mm != current->active_mm); > /* more a memory barrier than a real lock */ > task_lock(current); > + /* > + * When a thread stops operating on an address space, the loop > + * in membarrier_{private,global}_expedited() may not observe > + * that tsk->mm, and not issue an IPI. Membarrier requires a > + * memory barrier after accessing user-space memory, before > + * clearing tsk->mm. > + */ > + smp_mb(); So like stated above, I'm not sure how missing that IPI is going to be a problem, we're dying... > current->mm = NULL; > mmap_read_unlock(mm); > enter_lazy_tlb(mm, current); > -- > 2.11.0 >