On Thu, 28 Jul 2022, John Stultz wrote: > Using cyclictest without specifying affinity via -a, I was > noticing a strange issue where the rt threads where not > migrating when being blocked. > > After lots of debugging in the kernel, I found its actually an > issue with cyclictest. > > When using -t there is no behavioral difference between specifying > -a or not specifying -a. > > This can be confirmed by adding printf messages around the > pthread_setaffinity_np() call in the threadtest function. > > Currently: > > root@localhost:~/rt-tests# ./cyclictest -t -a -q -D1 > Affining thread 0 to cpu: 0 > Affining thread 1 to cpu: 1 > Affining thread 2 to cpu: 2 > Affining thread 3 to cpu: 3 > Affining thread 4 to cpu: 4 > Affining thread 5 to cpu: 5 > Affining thread 7 to cpu: 7 > Affining thread 6 to cpu: 6 > T: 0 (15034) P: 0 I:1000 C: 1000 Min: 82 Act: 184 Avg: 180 Max: 705 > ... > > root@localhost:~/rt-tests# ./cyclictest -t -q -D1 > Affining thread 0 to cpu: 0 > Affining thread 1 to cpu: 1 > Affining thread 2 to cpu: 2 > Affining thread 3 to cpu: 3 > Affining thread 4 to cpu: 4 > Affining thread 5 to cpu: 5 > Affining thread 6 to cpu: 6 > Affining thread 7 to cpu: 7 > T: 0 (15044) P: 0 I:1000 C: 1000 Min: 74 Act: 144 Avg: 162 Max: 860 > .. > > This issue seems to come from the logic in process_options(): > /* if smp wasn't requested, test for numa automatically */ > if (!smp) { > numa = numa_initialize(); > if (setaffinity == AFFINITY_UNSPECIFIED) > setaffinity = AFFINITY_USEALL; > } > > Here, by setting setaffinity = AFFINITY_USEALL, we effectively > pin each thread to its respective cpu, same as the "-a" option. > > This was most recently introduced in commit bdb8350f1b0b > ("Revert "cyclictest: Use affinity_mask for steering > thread placement""). > > This seems erronious to me, so I wanted to share this patch > which removes the overriding AFFINITY_UNSPECIFIED with > AFFINITY_USEALL by default. Also, some additional tweaks to > preserve the existing numa allocation affinity. > > With this patch, we no longer call pthread_setaffinity_np() in the > "./cyclictest -t -q -D1" case. > > Cc: John Kacur <jkacur@xxxxxxxxxx> > Cc: Connor O'Brien <connoro@xxxxxxxxxx> > Cc: Qais Yousef <qais.yousef@xxxxxxx> > Signed-off-by: John Stultz <jstultz@xxxxxxxxxx> > --- > v2: Fixes error passing cpu = -1 to rt_numa_numa_node_of_cpu() > --- > src/cyclictest/cyclictest.c | 10 ++++++---- > 1 file changed, 6 insertions(+), 4 deletions(-) > > diff --git a/src/cyclictest/cyclictest.c b/src/cyclictest/cyclictest.c > index decea78..82759d1 100644 > --- a/src/cyclictest/cyclictest.c > +++ b/src/cyclictest/cyclictest.c > @@ -1270,8 +1270,6 @@ static void process_options(int argc, char *argv[], int max_cpus) > /* if smp wasn't requested, test for numa automatically */ > if (!smp) { > numa = numa_initialize(); > - if (setaffinity == AFFINITY_UNSPECIFIED) > - setaffinity = AFFINITY_USEALL; > } > > if (option_affinity) { > @@ -2043,9 +2041,13 @@ int main(int argc, char **argv) > void *stack; > void *currstk; > size_t stksize; > + int node_cpu = cpu; > + > + if (node_cpu == -1) > + node_cpu = cpu_for_thread_ua(i, max_cpus); > > /* find the memory node associated with the cpu i */ > - node = rt_numa_numa_node_of_cpu(cpu); > + node = rt_numa_numa_node_of_cpu(node_cpu); > > /* get the stack size set for this thread */ > if (pthread_attr_getstack(&attr, &currstk, &stksize)) > @@ -2056,7 +2058,7 @@ int main(int argc, char **argv) > stksize = PTHREAD_STACK_MIN * 2; > > /* allocate memory for a stack on appropriate node */ > - stack = rt_numa_numa_alloc_onnode(stksize, node, cpu); > + stack = rt_numa_numa_alloc_onnode(stksize, node, node_cpu); > > /* touch the stack pages to pre-fault them in */ > memset(stack, 0, stksize); > -- > 2.37.1.455.g008518b4e5-goog > > Still thinking about this, I guess if we go with the above change it will mean that to get meaningful numa behaviour you'll need to specify -a This is what happened, inorder to reduce the alphabet soup of cyclictest options, we decided that if you don't specify --smp, we would test if numa was available and go with that and both implied -a, so we are missing a third option. Another possibility I guess would be to put back --numa, since it's starting to appear that the cure was worse than the disease. At least that's more explicit than using -a to get numa behaviour. Or if -a is unused we could have a --no--affinity Not sure yet. John