On 29/04/2021 00:41, Song Bao Hua (Barry Song) wrote:
>
>
>> -----Original Message-----
>> From: Dietmar Eggemann [mailto:dietmar.eggemann@xxxxxxx]

[...]

>>>>> From: Dietmar Eggemann [mailto:dietmar.eggemann@xxxxxxx]
>>
>> [...]
>>
>>>>> On 20/04/2021 02:18, Barry Song wrote:

[...]

> Though we will never go to slow path, wake_wide() will affect want_affine,
> so eventually affect the "new_cpu"?

yes.

>
> 	for_each_domain(cpu, tmp) {
> 		/*
> 		 * If both 'cpu' and 'prev_cpu' are part of this domain,
> 		 * cpu is a valid SD_WAKE_AFFINE target.
> 		 */
> 		if (want_affine && (tmp->flags & SD_WAKE_AFFINE) &&
> 		    cpumask_test_cpu(prev_cpu, sched_domain_span(tmp))) {
> 			if (cpu != prev_cpu)
> 				new_cpu = wake_affine(tmp, p, cpu, prev_cpu, sync);
>
> 			sd = NULL; /* Prefer wake_affine over balance flags */
> 			break;
> 		}
>
> 		if (tmp->flags & sd_flag)
> 			sd = tmp;
> 		else if (!want_affine)
> 			break;
> 	}
>
> If wake_affine is false, the above won't execute, new_cpu(target) will
> always be "prev_cpu"? so when task size > cluster size in wake_wide(),
> this means we won't pull the wakee to the cluster of waker? It seems
> sensible.

What is `task size` here?

The criterion is `!(slave < factor || master < slave * factor)` or
`slave >= factor && master >= slave * factor` to wake wide.

I see that since you effectively change the sched domain size from LLC to
CLUSTER (e.g. 24->6) for wakeups with cpu and prev_cpu sharing LLC
(hence the `numactl -N 0` in your workload), wake_wide() has to take
CLUSTER size into consideration.

I was wondering if you saw wake_wide() returning 1 with your use cases:

numactl -N 0 /usr/lib/lmbench/bin/stream -P [6,12] -M 1024M -N 5
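
To make the wake-wide criterion above concrete, here is a minimal userspace
sketch of the decision. It is only an illustration, not the kernel code: the
function name `wake_wide_sketch` and its parameters are hypothetical. In the
kernel, the two inputs are the waker's and the wakee's `wakee_flips` counters,
and `factor` is the LLC size in mainline (assumed here to become the CLUSTER
size with the change under discussion).

```c
#include <stdbool.h>

/*
 * Userspace sketch of the wake_wide() decision (hypothetical helper).
 * waker_flips / wakee_flips stand in for the wakee_flips counters of
 * the waker and the wakee; factor stands in for the domain size
 * (e.g. 24 for LLC, 6 for CLUSTER in the example from this thread).
 */
bool wake_wide_sketch(unsigned int waker_flips,
		      unsigned int wakee_flips,
		      unsigned int factor)
{
	unsigned int master = waker_flips;
	unsigned int slave = wakee_flips;

	/* master always holds the larger of the two flip counts */
	if (master < slave) {
		unsigned int tmp = master;

		master = slave;
		slave = tmp;
	}

	/* Stay affine unless slave >= factor && master >= slave * factor */
	if (slave < factor || master < slave * factor)
		return false;

	return true;
}
```

With flip counts of 60 and 8, for instance, the sketch returns false for
factor = 24 (since 8 < 24) but true for factor = 6 (since 8 >= 6 and
60 >= 8 * 6). So shrinking the factor from LLC to CLUSTER size can flip the
outcome for the same pair of tasks.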