* Peter Zijlstra <peterz@xxxxxxxxxxxxx> [2013-07-30 11:10:21]:

> On Tue, Jul 30, 2013 at 02:33:45PM +0530, Srikar Dronamraju wrote:
> > * Peter Zijlstra <peterz@xxxxxxxxxxxxx> [2013-07-30 10:20:01]:
> >
> > > On Tue, Jul 30, 2013 at 10:17:55AM +0200, Peter Zijlstra wrote:
> > > > On Tue, Jul 30, 2013 at 01:18:15PM +0530, Srikar Dronamraju wrote:
> > > > > Here is an approach that looks to consolidate workloads across nodes.
> > > > > This results in much improved performance. Again I would assume this work
> > > > > is complementary to Mel's work with numa faulting.
> > > >
> > > > I highly dislike the use of task weights here. It seems completely
> > > > unrelated to the problem at hand.
> > >
> > > I also don't particularly like the fact that it's purely process based.
> > > The faults information we have gives much richer task relations.
> > >
> >
> > With just pure fault information based approach, I am not seeing any
> > major improvement in tasks/memory consolidation. I still see memory
> > spread across different nodes and tasks getting ping-ponged to different
> > nodes. And if there are multiple unrelated processes, then we see a mix
> > of tasks of different processes in each of the node.
>
> The fault thing isn't finished. Mel explicitly said it doesn't yet have
> inter-task relations. And you run everything in a VM which is like a big
> nasty mangler for anything sane.

I am not against faults; fault-based handling is very much needed, and I
have noted that this approach is complementary to the numa faults that
Mel is proposing. Right now I think that if we first get the tasks to
consolidate on nodes and then use the numa faults to place them, we
would have a very good solution.

Plain fault information actually causes confusion in enough cases,
especially if the initial set of pages is consolidated into a small set
of nodes. With plain fault information alone, the "memory follows cpu"
and "cpu follows memory" policies conflict with each other.
Memory wants to move to the nodes where the tasks are currently
running, while the tasks plan to move to the nodes where the memory
currently resides.

Also, most of the consolidation that I have proposed is fairly
conservative and is mostly done at idle balance time, so it should not
affect the numa faulting in any way. When I run with my patches (along
with some debug code), consolidation happens pretty quickly. Once
consolidation has happened, numa faults would be of immense value.

Here is how I am looking at the solution:

1. Until the initial scan delay, allow tasks to consolidate.

2. From the first scan delay to the second scan delay, account numa
   faults and allow memory to move, but do not yet use numa faults to
   drive scheduling decisions. Tasks continue to consolidate during this
   stage as well, so tasks and memory both move toward specific nodes,
   i.e. consolidation.

3. After the second scan delay, continue to account numa faults and
   allow numa faults to drive scheduling decisions.

Whether at stage 3 we should use task weights, or just numa faults, or
which of the two should get more preference, is something I am not clear
about at this time. For now I would think we need to factor in both.

I think this approach would mean that tasks get consolidated while the
inter-process and inter-task relations that you are looking for also
remain strong.

Is this an acceptable solution?

-- 
Thanks and Regards
Srikar Dronamraju