On Fri, 2012-05-25 at 19:02 +0200, Andrea Arcangeli wrote:
> The CFS scheduler is still in charge of all scheduling
> decisions. AutoNUMA balancing at times will override those. But
> generally we'll just rely on the CFS scheduler to keep doing its
> thing, but while preferring the autonuma affine nodes when deciding
> to move a process to a different runqueue or when waking it up.
>
> For example the idle balancing, will look into the runqueues of the
> busy CPUs, but it'll search first for a task that wants to run into
> the idle CPU in AutoNUMA terms (task_autonuma_cpu() being true).
>
> Most of this is encoded in the can_migrate_task becoming AutoNUMA
> aware and running two passes for each balancing pass, the first NUMA
> aware, and the second one relaxed.
>
> The idle/newidle balancing is always allowed to fallback into
> non-affine AutoNUMA tasks. The load_balancing (which is more a
> fairness than a performance issue) is instead only able to cross over
> the AutoNUMA affinity if the flag controlled by
> /sys/kernel/mm/autonuma/scheduler/load_balance_strict is not set (it
> is set by default).

This is unacceptable, and contradicts your earlier claim that you rely
on the regular load-balancer.

The strict mode needs to go; load-balancing is a best effort and
fairness is important -- so much so to some people that I get
complaints the current thing isn't strong enough.

Your strict mode basically supplants any and all balancing done at node
level and above.

Please use something like:

  https://lkml.org/lkml/2012/5/19/53

with the sched_setnode() function from:

  https://lkml.org/lkml/2012/5/18/109

Fairness matters because people expect similar throughput or runtimes,
so the order should be: first ensure equal load on cpus, and only then
bother with node placement.

Furthermore, load-balancing does things like trying to place tasks that
wake each other closer together; your strict mode completely breaks
that.
Instead, if the balancer finds these tasks are related and should be together that should be a hint the memory needs to come to them, not the other way around.
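
As a rough illustration of the two-pass selection described in the quoted text, and of why a strict mode can leave load unbalanced, here is a toy Python model. It is not kernel code: Task, Cpu, pick_task(), balance() and the preferred_node field are hypothetical names made up for this sketch, and runqueue length stands in for load.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    preferred_node: int  # hypothetical stand-in for the AutoNUMA-affine node


@dataclass
class Cpu:
    node: int
    runqueue: list = field(default_factory=list)


def pick_task(src, dst, strict):
    """Two-pass selection: NUMA-affine tasks first, then (unless strict) any task."""
    # Pass 1: only tasks that prefer the destination CPU's node.
    for task in src.runqueue:
        if task.preferred_node == dst.node:
            return task
    # Pass 2: relaxed; skipped entirely when the strict flag is set.
    if not strict and src.runqueue:
        return src.runqueue[0]
    return None


def balance(cpus, strict):
    """Pull tasks from the busiest to the least loaded CPU until loads differ by at most 1."""
    while True:
        busiest = max(cpus, key=lambda c: len(c.runqueue))
        idlest = min(cpus, key=lambda c: len(c.runqueue))
        if len(busiest.runqueue) - len(idlest.runqueue) <= 1:
            return
        task = pick_task(busiest, idlest, strict)
        if task is None:
            return  # nothing may be moved; the imbalance stays
        busiest.runqueue.remove(task)
        idlest.runqueue.append(task)


def make_cpus():
    # Two nodes with one CPU each; four tasks that all prefer node 0.
    cpus = [Cpu(node=0), Cpu(node=1)]
    cpus[0].runqueue = [Task(f"t{i}", preferred_node=0) for i in range(4)]
    return cpus


def loads(cpus):
    return [len(c.runqueue) for c in cpus]
```

In this model, calling balance(make_cpus(), strict=True) moves nothing, because no task prefers node 1, so one CPU keeps all four tasks while the other idles. With strict=False the relaxed pass evens out the load first and node preference only decides which task moves, which is the ordering argued for above.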