On Fri, Nov 30, 2012 at 12:37:49PM -0800, Linus Torvalds wrote:
> On Fri, Nov 30, 2012 at 11:58 AM, Ingo Molnar <mingo@xxxxxxxxxx> wrote:
> >
> > When pushed hard enough via threaded workloads (for example via the
> > numa02 test) then the upstream page migration code in mm/migration.c
> > becomes unscalable, resulting in lot of scheduling on the anon vma
> > mutex and a subsequent drop in performance.
>
> Ugh.
>
> I wonder if migration really needs that thing to be a mutex? I may be
> wrong, but the anon_vma lock only protects the actual rmap chains, and
> migration only ever changes the pte *contents*, not the actual chains
> of pte's themselves, right?
>

Pretty much. As far as migration is concerned, all that is critical is
that it finds all the old migration ptes and restores them. If any of
them are missed then it will likely BUG later when the page is faulted.
If a process happened to exit while the anon_vma mutex was not held and
the migration pte and anon_vma disappeared during migration, it would
not matter as such. If the protection was a rwsem then migration might
cause delays in a parallel unmap or exit until the migration completed,
but I doubt it would ever be noticed.

> So if this is a migration-specific scalability issue, then it might be
> possible to solve by making the mutex be a rwsem instead, and have
> migration only take it for reading.
>
> Of course, I'm quite possibly wrong, and the code depends on full
> mutual exclusion.
>
> Just a thought, in case it makes somebody go "Hmm.."
>

Offhand, I cannot think of a reason why a rwsem would not work. This
thing originally became a mutex because the RT people (Peter in
particular) cared about being able to preempt faster. It'd be nice if
they confirmed that a rwsem would not be a problem for them.

--
Mel Gorman
SUSE Labs