Hi, Srikar, Srikar Dronamraju <srikar@xxxxxxxxxxxxxxxxxx> writes: > * Huang, Ying <ying.huang@xxxxxxxxx> [2019-07-25 16:01:24]: > >> From: Huang Ying <ying.huang@xxxxxxxxx> >> >> From the commit log and comments of commit 37ec97deb3a8 ("sched/numa: >> Slow down scan rate if shared faults dominate"), the autonuma scan >> period should be increased (scanning is slowed down) if the majority >> of the page accesses are shared with other processes. But in current >> code, the scan period will be decreased (scanning is speeded up) in >> that situation. >> >> The commit log and comments make more sense. So this patch fixes the >> code to make it match the commit log and comments. And this has been >> verified via tracing the scan period changing and /proc/vmstat >> numa_pte_updates counter when running a multi-threaded memory >> accessing program (most memory areas are accessed by multiple >> threads). >> > > Lets split into 4 modes. > More Local and Private Page Accesses: > We definitely want to scan slowly i.e increase the scan window. > > More Local and Shared Page Accesses: > We still want to scan slowly because we have consolidated and there is no > point in scanning faster. So scan slowly + increase the scan window. > (Do remember access on any active node counts as local!!!) > > More Remote + Private page Accesses: > Most likely the Private accesses are going to be local accesses. > > In the unlikely event of the private accesses not being local, we should > scan faster so that the memory and task consolidates. > > More Remote + Shared page Accesses: This means the workload has not > consolidated and needs to scan faster. So we need to scan faster. This sounds reasonable. But lr_ratio < NUMA_PERIOD_THRESHOLD doesn't indicate More Remote. If Local = Remote, it is also true. If there are also more Shared, we should slow down the scanning. So, the logic could be if (lr_ratio >= NUMA_PERIOD_THRESHOLD) slow down scanning else if (sp_ratio >= NUMA_PERIOD_THRESHOLD) { if (NUMA_PERIOD_SLOTS - lr_ratio >= NUMA_PERIOD_THRESHOLD) speed up scanning else slow down scanning } else speed up scanning This follows your idea better? Best Regards, Huang, Ying > So I would think we should go back to before 37ec97deb3a8. > > i.e > > int slot = lr_ratio - NUMA_PERIOD_THRESHOLD; > > if (!slot) > slot = 1; > diff = slot * period_slot; > > > No? > >> Fixes: 37ec97deb3a8 ("sched/numa: Slow down scan rate if shared faults dominate") >> Signed-off-by: "Huang, Ying" <ying.huang@xxxxxxxxx> >> Cc: Rik van Riel <riel@xxxxxxxxxx> >> Cc: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx> >> Cc: Mel Gorman <mgorman@xxxxxxx> >> Cc: jhladky@xxxxxxxxxx >> Cc: lvenanci@xxxxxxxxxx >> Cc: Ingo Molnar <mingo@xxxxxxxxxx> >> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> >> --- >> kernel/sched/fair.c | 20 ++++++++++---------- >> 1 file changed, 10 insertions(+), 10 deletions(-) >> >> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c >> index 036be95a87e9..468a1c5038b2 100644 >> --- a/kernel/sched/fair.c >> +++ b/kernel/sched/fair.c >> @@ -1940,7 +1940,7 @@ static void update_task_scan_period(struct task_struct *p, >> unsigned long shared, unsigned long private) >> { >> unsigned int period_slot; >> - int lr_ratio, ps_ratio; >> + int lr_ratio, sp_ratio; >> int diff; >> >> unsigned long remote = p->numa_faults_locality[0]; >> @@ -1971,22 +1971,22 @@ static void update_task_scan_period(struct task_struct *p, >> */ >> period_slot = DIV_ROUND_UP(p->numa_scan_period, NUMA_PERIOD_SLOTS); >> lr_ratio = (local * NUMA_PERIOD_SLOTS) / (local + remote); >> - ps_ratio = (private * NUMA_PERIOD_SLOTS) / (private + shared); >> + sp_ratio = (shared * NUMA_PERIOD_SLOTS) / (private + shared); >> >> - if (ps_ratio >= NUMA_PERIOD_THRESHOLD) { >> + if (sp_ratio >= NUMA_PERIOD_THRESHOLD) { >> /* >> - * Most memory accesses are local. There is no need to >> - * do fast NUMA scanning, since memory is already local. >> + * Most memory accesses are shared with other tasks. >> + * There is no point in continuing fast NUMA scanning, >> + * since other tasks may just move the memory elsewhere. > > With this change, I would expect that with Shared page accesses, > consolidation to take a hit. > >> */ >> - int slot = ps_ratio - NUMA_PERIOD_THRESHOLD; >> + int slot = sp_ratio - NUMA_PERIOD_THRESHOLD; >> if (!slot) >> slot = 1; >> diff = slot * period_slot; >> } else if (lr_ratio >= NUMA_PERIOD_THRESHOLD) { >> /* >> - * Most memory accesses are shared with other tasks. >> - * There is no point in continuing fast NUMA scanning, >> - * since other tasks may just move the memory elsewhere. >> + * Most memory accesses are local. There is no need to >> + * do fast NUMA scanning, since memory is already local. > > Comment wise this make sense. > >> */ >> int slot = lr_ratio - NUMA_PERIOD_THRESHOLD; >> if (!slot) >> @@ -1998,7 +1998,7 @@ static void update_task_scan_period(struct task_struct *p, >> * yet they are not on the local NUMA node. Speed up >> * NUMA scanning to get the memory moved over. >> */ >> - int ratio = max(lr_ratio, ps_ratio); >> + int ratio = max(lr_ratio, sp_ratio); >> diff = -(NUMA_PERIOD_THRESHOLD - ratio) * period_slot; >> } >> >> -- >> 2.20.1 >>