Re: cyclictest: possible bugs in CPU enumeration

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 10/11/18 3:47 PM, Clark Williams wrote:
> Comments below.
> 
> BTW Ralf, I do appreciate the work. Sorry for being slow responding but for
> some reason my manager expects me to do work... :)

Not in a hurry, my workarounds still do a good job. ;-)

> 
> On Fri, 28 Sep 2018 15:25:17 +0200
> Ralf Ramsauer <ralf.ramsauer@xxxxxxxxxxxxxxxxx> wrote:
> 
>> Hi,
>>
>> I think I might have trapped into some bugs in cyclictest's CPU enumeration.
>>
>> a) If cyclictest is parameterised with "-a 0,2", then in
>> process_options() we will run into:
>>
>>     [...]
>>     if (optarg != NULL) {
>>             parse_cpumask(optarg, max_cpus);
>>             setaffinity = AFFINITY_SPECIFIED;
>>     } else if (optind<argc && atoi(argv[optind])) {
>>>                             ^^^^
>>>                             |- atoi will be 0, and this case fails  
>>             parse_cpumask(argv[optind], max_cpus);
>>             setaffinity = AFFINITY_SPECIFIED;
>>     } else {
>>             setaffinity = AFFINITY_USEALL;
>>     }
>>
>> Though we should, we won't enter this case and end up with
>> AFFINITY_USEALL. So I have to reverse the argument to "-a 2,0" as a
>> quick workaround.
>>
>> Isn't the atoi() check superfluous? We're not relying on its result in
>> any case.
>>
>>
>> b) I boot with nosmt on a 2 core / 2 threads x86 machine where I can not
>> disable SMT via firmware. nosmt results in offlining SMT siblings:
>>     # cat /sys/devices/system/cpu/o{ff,n}line
>>     1,3-7
>>     0,2
>>
>> As mentioned before, I have to reverse the CPU mask, so I run
>>     # ./cyclictest -p 99 -m -t 2 --affinity 2,0
>>     Bug in cpu mask handling code.

We could exit() in case we hit an implementation bug. It took me some
time to notice that the affinity of cyclictest's threads is actually
wrong, as I missed the output.

>>
>> This bug is hidden in max_cpus. For getting max_cpu, cyclictest uses:
>>         int max_cpus = sysconf(_SC_NPROCESSORS_ONLN);
>>
>> But if some CPUs are offlined, then NRPROC_ONLN !=
>> max_cpus_available... Later, when cyclictest tries to assign threads
>> to CPUs, max_cpus is the highest CPU# that will be considered in
>> enumeration. I think that's wrong, if some CPUs are offline.
>>
>> The dirty steam hammer quickfix of hard-coding max_cpus = 4 works for
>> my setup atm.
>>
>> A fix could be to somehow retrieve the number of present CPUs. Is
>> there some API available, besides parsing sysfs?
> 
> 
> Ah, I see. We may need to change that call to be:
> 
> 	int max_cpus = sysconf(_SC_NPROCESSORS_CONF);
> 
> That way we'll have the correct max value, but that means we need to insert
> an online check for each cpu when we get passed a cpulist.

Yeah, that's a better candidate -- but wouldn't it be better to have a
mask of online CPUs? Then we wouldn't have to check the status of every
CPU we hit, and it implicitly contains max_cpus.

> 
> I'm torn on whether we should just punt if we get passed an offline cpu or
> whether we should just drop it from the list and run with the ones that actually
> are online. Thoughts?

If an user tries to set the affinity to a CPU that is offline, then he's
doing something wrong. Instead of skipping offlined CPUs, though being
specified, I think it's better to error+exit. Everything else might lead
to some results of cyclictest that might be misinterpreted (as it
happened to me :) ).

It might be done as follows:
  1) Get cpu_set of online CPUs
  2) Get cpu_set of desired target CPUs (affinity mask)
  3) if ((online & desired) != desired) fail();
  4) Starting from LSB, round-robin set the CPU for the thread

This is probably easier than the current cpu_for_thread() mechanism, but
it won't respect extraordinary arguments like "-a 3-5,0 -t 5", where
thread 1 and 5 are run on CPU 3 (as mentioned in the manpage). The
question is, if there are real-world scenarios where this complexity is
actually required.

> 
>>
>>
>> c) On a 4 core arm64 system, where numa.h is not available, cyclictest
>> uses its own implementation in rt_numa.h. In my configuration, all
>> available cores are up and running. Let's say I want to spawn
>> three threads allocated on the uppermost three CPUS:
>>   # cyclictest -p 99 -m -a 1-3 -t 3 -v
>>
>> Though -a is provided, cyclictest will move all threads to CPU 1:
>>     Max CPUs = 4
>>     # /dev/cpu_dma_latency set to 0us
>>     Thread 0 Interval: 1500
>>     Thread 0 using cpu 1.
>>     Thread 1 Interval: 2000
>>     Thread 1 using cpu 1.
>>     Thread 2 Interval: 2500
>>     Thread 2 using cpu 1.
>>
>> Digging a beet deeper, I think rt_numa.h's rt_numa_parse_cpustring()
>> is broken (or incomplete?) in rt_numa.h:230:
>>     cpu = atoi(s);
>>
>> where s = "1-3". Of course, this will fail. AFAICT,
>> rt_numa_parse_cpustring() currently doesn't support cpu strings like
>> "1-3" and silently fails.
>>
>> Again, my dirty workaround to get things running is to hardcode
>> enumeration.
>>
>>
>> I'm not aware of the history of cyclictest, and maybe there are some
>> reasons for this behavior. If so, please let me know!
>>
>> IMHO, at least bug a) should be easy to fix. I tried to implement a
>> proper fix for b) and c), but I have to admit that CPU enumeration is
>> a beast.
>>
> 
> Yeah, the parsing of cpu ranges is a PITA, no question. I'm not surprised
> that we don't handle ranges properly. We just need to take a harder look
> a the parsing code. 

... that maybe could be copied from the kernel. Or try to outsource
parsing to numactl (which seems to do a good job). But I'm not sure if
numactl is supported on all target architectures.

> 
> Could I trouble you to open a BZ on bugzilla.kernel.org for cyclictest?

Sure, if it helps. Which category? I'll simply c&p parts of my initial
report then.

Thanks,
  Ralf

> 
> Clark
> 
> 
> 



[Index of Archives]     [RT Stable]     [Kernel Newbies]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Samba]     [Video 4 Linux]     [Device Mapper]

  Powered by Linux