Re: [PATCH v2] cyclictest: Fix threads being affined even when -a isn't set

On Thu, 11 Aug 2022, John Stultz wrote:

> On Thu, Aug 11, 2022 at 11:57 AM John Stultz <jstultz@xxxxxxxxxx> wrote:
> >
> > On Thu, Aug 11, 2022 at 10:57 AM John Kacur <jkacur@xxxxxxxxxx> wrote:
> > > On Thu, 28 Jul 2022, John Stultz wrote:
> > >
> > > > Using cyclictest without specifying affinity via -a, I was
> > > > noticing a strange issue where the rt threads were not
> > > > migrating when being blocked.
> > > >
> > > > After lots of debugging in the kernel, I found it's actually an
> > > > issue with cyclictest.
> > > >
> > > > When using -t, there is no behavioral difference between specifying
> > > > -a and not specifying -a.
> > > >
> > > > This can be confirmed by adding printf messages around the
> > > > pthread_setaffinity_np() call in the threadtest function.
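> > > >
> > > > (A minimal sketch of that kind of debug printf; the surrounding
> > > > thread-function code and field names are paraphrased here, not
> > > > quoted verbatim:)
> > > >
> > > >       if (par->cpu != -1) {
> > > >               pthread_t thread = pthread_self();
> > > >               CPU_ZERO(&mask);
> > > >               CPU_SET(par->cpu, &mask);
> > > >               /* debug: report where this thread ends up pinned */
> > > >               printf("Affining thread %d to cpu: %d\n", par->tnum, par->cpu);
> > > >               if (pthread_setaffinity_np(thread, sizeof(mask), &mask) == -1)
> > > >                       warn("Could not set CPU affinity to CPU #%d\n", par->cpu);
> > > >       }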
> > > >
> > > > Currently:
> > > >
> > > > root@localhost:~/rt-tests# ./cyclictest -t -a -q -D1
> > > > Affining thread 0 to cpu: 0
> > > > Affining thread 1 to cpu: 1
> > > > Affining thread 2 to cpu: 2
> > > > Affining thread 3 to cpu: 3
> > > > Affining thread 4 to cpu: 4
> > > > Affining thread 5 to cpu: 5
> > > > Affining thread 7 to cpu: 7
> > > > Affining thread 6 to cpu: 6
> > > > T: 0 (15034) P: 0 I:1000 C:   1000 Min:     82 Act:  184 Avg:  180 Max: 705
> > > > ...
> > > >
> > > > root@localhost:~/rt-tests# ./cyclictest -t -q -D1
> > > > Affining thread 0 to cpu: 0
> > > > Affining thread 1 to cpu: 1
> > > > Affining thread 2 to cpu: 2
> > > > Affining thread 3 to cpu: 3
> > > > Affining thread 4 to cpu: 4
> > > > Affining thread 5 to cpu: 5
> > > > Affining thread 6 to cpu: 6
> > > > Affining thread 7 to cpu: 7
> > > > T: 0 (15044) P: 0 I:1000 C:   1000 Min:     74 Act:  144 Avg:  162 Max: 860
> > > > ..
> > > >
> > > > This issue seems to come from the logic in process_options():
> > > >       /* if smp wasn't requested, test for numa automatically */
> > > >       if (!smp) {
> > > >               numa = numa_initialize();
> > > >               if (setaffinity == AFFINITY_UNSPECIFIED)
> > > >                       setaffinity = AFFINITY_USEALL;
> > > >         }
> > > >
> > > > Here, by setting setaffinity = AFFINITY_USEALL, we effectively
> > > > pin each thread to its respective cpu, same as the "-a" option.
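> > > >
> > > > (The per-thread cpu selection in main() looks roughly like the
> > > > following, paraphrased from memory, which is why AFFINITY_USEALL
> > > > ends up behaving the same as -a:)
> > > >
> > > >       switch (setaffinity) {
> > > >       case AFFINITY_UNSPECIFIED:
> > > >               cpu = -1; break;
> > > >       case AFFINITY_SPECIFIED:
> > > >               cpu = cpu_for_thread_sp(i, max_cpus, affinity_mask); break;
> > > >       case AFFINITY_USEALL:
> > > >               cpu = cpu_for_thread_ua(i, max_cpus); break;
> > > >       }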
> > > >
> > > > This was most recently introduced in commit bdb8350f1b0b
> > > > ("Revert "cyclictest: Use affinity_mask for steering
> > > > thread placement"").
> > > >
> > > > This seems erroneous to me, so I wanted to share this patch,
> > > > which removes the overriding of AFFINITY_UNSPECIFIED with
> > > > AFFINITY_USEALL by default. It also includes some additional
> > > > tweaks to preserve the existing numa allocation affinity.
> > > >
> > > > With this patch, we no longer call pthread_setaffinity_np() in the
> > > > "./cyclictest -t -q -D1"  case.
> > > >
> > > > Cc: John Kacur <jkacur@xxxxxxxxxx>
> > > > Cc: Connor O'Brien <connoro@xxxxxxxxxx>
> > > > Cc: Qais Yousef <qais.yousef@xxxxxxx>
> > > > Signed-off-by: John Stultz <jstultz@xxxxxxxxxx>
> > > > ---
> > > > v2: Fixes error passing cpu = -1 to rt_numa_numa_node_of_cpu()
> > > > ---
> > > >  src/cyclictest/cyclictest.c | 10 ++++++----
> > > >  1 file changed, 6 insertions(+), 4 deletions(-)
> > > >
> > > > diff --git a/src/cyclictest/cyclictest.c b/src/cyclictest/cyclictest.c
> > > > index decea78..82759d1 100644
> > > > --- a/src/cyclictest/cyclictest.c
> > > > +++ b/src/cyclictest/cyclictest.c
> > > > @@ -1270,8 +1270,6 @@ static void process_options(int argc, char *argv[], int max_cpus)
> > > >       /* if smp wasn't requested, test for numa automatically */
> > > >       if (!smp) {
> > > >               numa = numa_initialize();
> > > > -             if (setaffinity == AFFINITY_UNSPECIFIED)
> > > > -                     setaffinity = AFFINITY_USEALL;
> > > >       }
> > > >
> > > >       if (option_affinity) {
> > > > @@ -2043,9 +2041,13 @@ int main(int argc, char **argv)
> > > >                       void *stack;
> > > >                       void *currstk;
> > > >                       size_t stksize;
> > > > +                     int node_cpu = cpu;
> > > > +
> > > > +                     if (node_cpu == -1)
> > > > +                             node_cpu = cpu_for_thread_ua(i, max_cpus);
> > > >
> > > >                       /* find the memory node associated with the cpu i */
> > > > -                     node = rt_numa_numa_node_of_cpu(cpu);
> > > > +                     node = rt_numa_numa_node_of_cpu(node_cpu);
> > > >
> > > >                       /* get the stack size set for this thread */
> > > >                       if (pthread_attr_getstack(&attr, &currstk, &stksize))
> > > > @@ -2056,7 +2058,7 @@ int main(int argc, char **argv)
> > > >                               stksize = PTHREAD_STACK_MIN * 2;
> > > >
> > > >                       /*  allocate memory for a stack on appropriate node */
> > > > -                     stack = rt_numa_numa_alloc_onnode(stksize, node, cpu);
> > > > +                     stack = rt_numa_numa_alloc_onnode(stksize, node, node_cpu);
> > > >
> > > >                       /* touch the stack pages to pre-fault them in */
> > > >                       memset(stack, 0, stksize);
> > > > --
> > > > 2.37.1.455.g008518b4e5-goog
> > > >
> > > >
> > >
> > > Still thinking about this, I guess if we go with the above change it will
> > > mean that to get meaningful numa behaviour you'll need to specify -a.
> >
> > So, what exactly do you mean when you say "meaningful numa behavior"?
> >
> > My instinct is that you want the memory for a running thread allocated on
> > the numa node it's running on.
> > Ideally the kernel should take care of this automatically (doing local
> > node allocations and preferring the tasks' local node in the
> > scheduler), but I guess if you are trying to set up the initial thread
> > distribution across all the nodes, that's a bit hard to do w/o setting
> > the affinity.  So I sort of expect that if you're wanting to align
> > allocations to numa nodes, you probably want to either explicitly set
> > the affinity, or not give any guidance and let the kernel make its
> > best choice.
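> >
> > (As an aside, the explicit libnuma version of "allocate on the node
> > the thread runs on" is roughly, assuming the thread is already pinned
> > or running where you expect:
> >
> >       int cpu = sched_getcpu();
> >       int node = numa_node_of_cpu(cpu);
> >       void *stack = numa_alloc_onnode(stksize, node);
> >
> > but the kernel's default first-touch policy usually gets you that
> > anyway, as long as the thread faults its own stack pages in.)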
> >
> > I guess there's a fairly recent SCHED_SOFT_AFFINITY, which might
> > provide a soft affinity for numa?
> >
> > > This is what happened: in order to reduce the alphabet soup of cyclictest
> > > options, we decided that if you don't specify --smp, we would test whether
> > > numa was available and go with that, and both implied -a, so we are missing
> > > a third option.
> > >
> > > Another possibility I guess would be to put back --numa, since it's
> > > starting to appear that the cure was worse than the disease. At least
> > > that's more explicit than using -a to get numa behaviour.
> >
> > This seems more intuitive to me.   Maybe --numa w/o -a would only
> > affine the cpumask to nodes, rather than cpus?
> 
> Thinking about this for just a second more, this shouldn't be too
> hard. You just have to get a node->cpumask mapping: rather than affining
> thread x -> cpu x, you would just affine to
> node_mask(node(thread_id)).
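> 
> Roughly, an untested sketch with libnuma (reusing the existing
> cpu_for_thread_ua() helper to spread the threads across nodes):
> 
>       int node = numa_node_of_cpu(cpu_for_thread_ua(i, max_cpus));
>       struct bitmask *nodecpus = numa_allocate_cpumask();
>       cpu_set_t mask;
>       int c;
> 
>       /* build a cpu_set_t covering every cpu in this node */
>       CPU_ZERO(&mask);
>       numa_node_to_cpus(node, nodecpus);
>       for (c = 0; c < numa_num_configured_cpus(); c++)
>               if (numa_bitmask_isbitset(nodecpus, c))
>                       CPU_SET(c, &mask);
>       pthread_setaffinity_np(thread, sizeof(mask), &mask);
>       numa_free_cpumask(nodecpus);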
> 
> thanks
> -john
> 

I had time to take a closer look at this, and it works fine. It isn't
causing any problems for rteval, for example, which explicitly sets -a
anyway when it calls cyclictest.

Thanks for your patch!

Signed-off-by: John Kacur <jkacur@xxxxxxxxxx>



