On Mon, Jan 29, 2018 at 06:36:05PM +0000, Mathieu Desnoyers wrote: > ----- On Jan 29, 2018, at 1:15 PM, Peter Zijlstra peterz@xxxxxxxxxxxxx wrote: > > Aaah, its the case where we do not pass through switch_mm(), the partial > > comment got to me. I only realized after reading the next patch. > > Indeed, if we read the entire comment, it's made clear that this case is for > when switch_mm is not invoked, where the current mm is changed without going > through switch_mm(), when scheduling between uthread->kthread->uthread for > instance. > > /* > * When transitioning from a kernel thread to a userspace > * thread, mmdrop()'s implicit full barrier is required by the > * membarrier system call, because the current active_mm can > * become the current mm without going through switch_mm(). > * membarrier also requires a core serializing instruction > * before going back to user-space after storing to rq->curr. > */ > > Is there something I should improve in the wording of this added > sentence to make it clearer ? Can be improved I think, its got two unqualified "membarrier"s in and its a bit mixed up. I'm having a major case of the mondays (brain just won't start today), but maybe something like: When we switched through a kernel thread, the loop in membarrier_{private,global}_expedited() can have observed that kernel thread and not issued an IPI. We will also not pass through switch_mm(). Membarrier requires a barrier after writing rq->curr and returning to userspace, so provide them here: - a full memory barrier for {PRIVATE,GLOBAL}_EXPEDITED - a sync_core for SYNC_CORE Also I think changing the changlog to state where we need core-sync would be good. Currently the x86 patch does that, but not this one, while this introduces the feature.