On Wed, 2017-10-04 at 18:18 +0200, Peter Zijlstra wrote:
> On Tue, Oct 03, 2017 at 10:39:32AM +0200, Peter Zijlstra wrote:
> > So I was waiting for Rik, who promised to run a bunch of NUMA
> > workloads over the weekend.
> >
> > The trivial thing regresses a wee bit on the overloaded case, I've
> > not yet tried to fix it.
>
> WA_IDLE is my 'old' patch and what you all tested, WA_WEIGHT is the
> addition -- based on the old scheme -- that I've tried in order to
> lift the overloaded case (including hackbench).
>
> Its not an unconditional win, but I'm tempted to default enable
> WA_WEIGHT too (I've not done NO_WA_IDLE && WA_WEIGHT runs).

Enabling both makes sense to me.

We have four cases to deal with:
- mostly idle system, in that case we don't really care,
  since select_idle_sibling will find an idle core anywhere
- partially loaded system (say 1/2 or 2/3 full), in that case
  WA_IDLE will be a great policy to help locate an idle CPU
- fully loaded system, in this case either policy works well
- overloaded system, in this case WA_WEIGHT seems to do the
  trick, assuming load balancing results in largely similar
  loads between cores inside each LLC

The big danger is affine wakeups messing up the balance
the load balancer works on, with the two mechanisms
messing up each other's placement.

However, there seems to be very little we can actually
do about that, without the unacceptable overhead of
examining the instantaneous loads on every CPU in an
LLC - otherwise we end up either overshooting, or not
taking advantage of idle CPUs, due to the use of cached
load values.

-- 
All rights reversed
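
For illustration only, the four cases above can be written down as a toy classifier. This is a rough sketch, not kernel code: the `LLC` class, the `load_case` and `relevant_policy` functions, and the 1/2 cutoff between "mostly idle" and "partially loaded" are all assumptions made up for the example. Only the names WA_IDLE and WA_WEIGHT and the case-to-policy mapping come from the discussion.

```python
from dataclasses import dataclass


@dataclass
class LLC:
    """Toy model of one last-level-cache domain (hypothetical, not a kernel struct)."""
    nr_cpus: int
    nr_running: int  # runnable tasks currently placed in this LLC


def load_case(llc: LLC) -> str:
    """Classify an LLC into one of the four cases from the mail.

    The 1/2 boundary is an assumption; the mail only says
    "say 1/2 or 2/3 full" for the partially loaded case.
    Integer arithmetic avoids floating-point comparisons.
    """
    if 2 * llc.nr_running < llc.nr_cpus:
        return "mostly idle"
    if llc.nr_running < llc.nr_cpus:
        return "partially loaded"
    if llc.nr_running == llc.nr_cpus:
        return "fully loaded"
    return "overloaded"


def relevant_policy(llc: LLC) -> str:
    """Return which wake-affine heuristic the mail says matters in each case."""
    return {
        "mostly idle": "either",       # select_idle_sibling finds an idle core anyway
        "partially loaded": "WA_IDLE",
        "fully loaded": "either",
        "overloaded": "WA_WEIGHT",
    }[load_case(llc)]


if __name__ == "__main__":
    for running in (2, 5, 8, 12):
        llc = LLC(nr_cpus=8, nr_running=running)
        print(running, load_case(llc), relevant_policy(llc))
```

Since the two middle-to-high cases pick different heuristics and the other two do not care, the sketch is consistent with enabling both features by default. It deliberately ignores the cached-load problem described above.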