----- On Jan 28, 2019, at 3:46 PM, paulmck paulmck@xxxxxxxxxxxxx wrote: > On Mon, Jan 28, 2019 at 12:27:03PM -0800, Linus Torvalds wrote: >> On Mon, Jan 28, 2019 at 10:27 AM Mathieu Desnoyers >> <mathieu.desnoyers@xxxxxxxxxxxx> wrote: >> > >> > Jann Horn identified a racy access to p->mm in the global expedited >> > command of the membarrier system call. >> > >> > The suggested fix is to hold the task_lock() around the accesses to >> > p->mm and to the mm_struct membarrier_state field to guarantee the >> > existence of the mm_struct. >> >> Hmm. I think this is right. You shouldn't access another threads mm >> pointer without proper locking. >> >> That said, we *could* make the mm_cachep be SLAB_TYPESAFE_BY_RCU, >> which would allow speculatively reading data off the mm pointer under >> RCU. It might not be the *right* mm if somebody just did an exit, but >> for things like this it shouldn't matter. > > That sounds much simpler and more effective than the contention-reduction > approach that I suggested. ;-) I'd be tempted to stick to the locking approach for a fix, and implement Linus' type-safe mm_cachep idea if anyone complains about the overhead of membarrier GLOBAL_EXPEDITED (and submit for a future merge window). I tested the KASAN splat reproducer from Jann locally, and confirmed that my patch fixes the issue it reproduces. Please let me know if the task_lock() approach is OK as a fix for now. I'm also awaiting a Tested-by from Jann before submitting this for real. Thanks, Mathieu > > Thanx, Paul > >> But if this is the only case that might care, it sounds like just >> doing the proper locking is the right approach. >> >> Linus -- Mathieu Desnoyers EfficiOS Inc. http://www.efficios.com