On Mon, Jul 29, 2019 at 11:16:55AM -0400, Waiman Long wrote:
> On 7/29/19 10:24 AM, Peter Zijlstra wrote:
> > On Mon, Jul 29, 2019 at 10:52:35AM +0200, Peter Zijlstra wrote:
> >
> > ---
> > Subject: sched: Clean up active_mm reference counting
> > From: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> > Date: Mon Jul 29 16:05:15 CEST 2019
> >
> > The current active_mm reference counting is confusing and sub-optimal.
> >
> > Rewrite the code to explicitly consider the 4 separate cases:
> >
> >     user -> user
> >
> >         When switching between two user tasks, all we need to consider
> >         is switch_mm().
> >
> >     user -> kernel
> >
> >         When switching from a user task to a kernel task (which
> >         doesn't have an associated mm) we retain the last mm in our
> >         active_mm. Increment a reference count on active_mm.
> >
> >     kernel -> kernel
> >
> >         When switching between kernel threads, all we need to do is
> >         pass along the active_mm reference.
> >
> >     kernel -> user
> >
> >         When switching between a kernel and user task, we must switch
> >         from the last active_mm to the next mm, hoping of course that
> >         these are the same. Decrement a reference on the active_mm.
> >
> > The code keeps a different order, because as you'll note, both 'to
> > user' cases require switch_mm().
> >
> > And where the old code would increment/decrement for the 'kernel ->
> > kernel' case, the new code observes this is a neutral operation and
> > avoids touching the reference count.
>
> I am aware of that behavior which is indeed redundant, but it is not
> what I am trying to fix and so I kind of leave it alone in my patch.

Oh sure; and it's not all that important either. It is just that every
time I look at that code I get confused.

On top of that, the new code is easier to rip the active_mm stuff out
of, which is where it came from.
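
For readers following along, the four cases in the changelog can be sketched as a small user-space model. This is only an illustration under stated assumptions, not the patch itself: `toy_mm`, `toy_task`, `toy_switch_mm()` and `toy_context_switch()` are hypothetical names, the reference count is a plain int, and the drop on 'kernel -> user' happens immediately here, whereas the scheduler may well defer it until after the switch completes.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for the kernel structures; all names are hypothetical. */
struct toy_mm {
	int refcount;			/* models the mm reference count */
};

struct toy_task {
	struct toy_mm *mm;		/* NULL for a kernel thread */
	struct toy_mm *active_mm;	/* mm currently loaded for this task */
};

static int switch_mm_calls;		/* counts simulated switch_mm() calls */

static void toy_switch_mm(struct toy_mm *prev, struct toy_mm *next)
{
	(void)prev;
	(void)next;
	switch_mm_calls++;
}

/*
 * Ordered by destination, since both 'to user' cases need switch_mm():
 *
 *   kernel -> kernel   pass the borrowed reference along, count untouched
 *     user -> kernel   borrow prev's mm, take a reference
 *   kernel ->   user   switch_mm(), drop the borrowed reference
 *     user ->   user   switch_mm() only
 */
static void toy_context_switch(struct toy_task *prev, struct toy_task *next)
{
	if (!next->mm) {				/* to kernel */
		next->active_mm = prev->active_mm;
		if (prev->mm)				/* from user */
			prev->active_mm->refcount++;
		else					/* from kernel */
			prev->active_mm = NULL;
	} else {					/* to user */
		toy_switch_mm(prev->active_mm, next->mm);
		if (!prev->mm) {			/* from kernel */
			assert(prev->active_mm->refcount > 0);
			prev->active_mm->refcount--;	/* simplified: dropped right away */
			prev->active_mm = NULL;
		}
	}
}
```

Running a user task, two kernel threads and a second user task through this in sequence should leave every count where it started, with the 'kernel -> kernel' hop never touching the count at all.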