Re: Regression on rt kernel while using POSIX timers

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Tracy,

The tests below include '-n' argument which mean they will use
clock_nanosleep. I don't face any problems with this scenario. I face
problems when I do not provide the '-n' argument and POSIX timers are
used. But, for your information, I see average of 10us and max 18us
when I am running the tests that you provided. Please note that this is
without any workloads and these tests are performed on v4.1.30-rt34 on 
http://www.supermicro.com/products/motherboard/Atom/X10/A1SRi-2758F.cfm
.


As for running cyclictest with posix timers, I see the following
results with the same hardware and kernel:

Case 1:
~/rt-tests# ./cyclictest -t1 -p 80 -i 500 -l 100000   
# /dev/cpu_dma_latency set to 0us
policy: fifo: loadavg: 0.01 0.02 0.00 2/204 6936

T: 0 ( 6926) P:80 I:500 C: 100000 Min:     11 Act:   16 Avg:   15 Max:      65


Case 2:
~/rt-tests# ./cyclictest -a 3 -t1 -p 80 -i 500 -l 100000
# /dev/cpu_dma_latency set to 0us
policy: fifo: loadavg: 0.15 0.04 0.01 1/204 6954           

T: 0 ( 6943) P:80 I:500 C: 100000 Min:     31 Act:   45 Avg:   43 Max:      59

Note that the first test is without any cpu affinity and the second
test by pinning down  the cyclictest process to a CPU by providing the
'-a' argument.

I also tried tracing cyclictest using trace-cmd by enabling
hrtimer_start, sched_switch, sched_migrate_task, signal_generate and
grepping for cyclictest in the logs. I see the following transitions in
both the cases:

Case 1:
	TASK		CPU 	TIMESTAMP	EVENT		 
      cyclictest-2351  [003]  7773.130964: hrtimer_start:        hrtimer=0xffff88046b098828 function=posix_timer_fn/0x0 expires=7767043047952 softexpires=7767043047952
      cyclictest-2351  [003]  7773.130978: sched_switch:         cyclictest:2351 [19] S ==> ktimersoftd/3:36 [98]
   ktimersoftd/3-36    [003]  7773.130992: sched_switch:         ktimersoftd/3:36 [98] R ==> cyclictest:2351 [19]
      cyclictest-2351  [003]  7773.131003: sched_switch:         cyclictest:2351 [19] D ==> ktimersoftd/3:36 [19] <--- cyclictest is in TASK_UNINTERRUPTIBLE state.
   ktimersoftd/3-36    [003]  7773.131006: signal_generate:      sig=14 errno=0 code=131070 comm=cyclictest pid=2351 grp=0 res=0
   ktimersoftd/3-36    [003]  7773.131018: sched_switch:         ktimersoftd/3:36 [98] R ==> cyclictest:2351 [19]
      cyclictest-2351  [003]  7773.131031: sched_switch:         cyclictest:2351 [19] D ==> ktimersoftd/3:36 [19]
   ktimersoftd/3-36    [003]  7773.131046: sched_switch:         ktimersoftd/3:36 [98] R ==> cyclictest:2351 [19]

Case 2:
	TASK		CPU 	TIMESTAMP	EVENT		 
       cyclictest-2386  [004]  7833.312194: hrtimer_start:        hrtimer=0xffff880463a78198 function=posix_timer_fn/0x0 expires=7827177394186 softexpires=7827177394186
      cyclictest-2386  [004]  7833.312204: sched_switch:         cyclictest:2386 [19] S ==> swapper/4:0 [120]
   ktimersoftd/4-46    [004]  7833.312285: sched_migrate_task:   comm=cyclictest pid=2386 prio=19 orig_cpu=4 dest_cpu=5
   ktimersoftd/4-46    [004]  7833.312288: signal_generate:      sig=14 errno=0 code=131070 comm=cyclictest pid=2386 grp=0 res=0
          <idle>-0     [005]  7833.312289: sched_switch:         swapper/5:0 [120] R ==> cyclictest:2386 [19]
      cyclictest-2386  [005]  7833.312294: hrtimer_start:        hrtimer=0xffff880463a78198 function=posix_timer_fn/0x0 expires=7827177494186 softexpires=7827177494186
      cyclictest-2386  [005]  7833.312303: sched_switch:         cyclictest:2386 [19] S ==> swapper/5:0 [120]
   ktimersoftd/5-53    [005]  7833.312385: sched_migrate_task:   comm=cyclictest pid=2386 prio=19 orig_cpu=5 dest_cpu=4
   ktimersoftd/5-53    [005]  7833.312388: signal_generate:      sig=14 errno=0 code=131070 comm=cyclictest pid=2386 grp=0 res=0
          <idle>-0     [004]  7833.312390: sched_switch:         swapper/4:0 [120] R ==> cyclictest:2386 [19]
(Note: the above events indicate 2 loops of cyclictest.)

The above cases behave according to what Sebastian said in the previous
email.

In Case 1,  I am not sure why the 3 context switches are occuring
before the signal_generate. Also, why is cyclictest is in
TASK_UNINTERRUPTIBLE state (I deduced it by cross-referencing the 'D'
in include/linux/sched.h. Hope I am right.). Also, ktimersoftd seems to
inherit the priority of cyclictest. Is it to run the timer?

Case 2 is much less confusing and I think I am able to follow it.

Thanks,
Vedang

On Wed, 2017-03-01 at 13:03 -0600, Tracy Smith wrote:
> Pleaee isolate the specific release where latency regression occurred
> if the community has not already done so. Possibly backing out
> changes specific to timers and posix threads for that release to see
> what impacted latency.  This is crucial for NXP and others as they
> migrate to later releases of the kernel.
> 
> Also please share your latency numbers while testing for regression
> for each release if possible.
> 
> Using:
> 
> taskset -c 2 cyclictest -m -n -P -p 99 -t 1 -i 10000-136000000 
> taskset -c 3 cyclictest -m -n -P -p 99 -t  1 -i 10000000 -l 0
> 
> On an ARM-8 Cortex-A53 NXP LS1043ARDB 4x core the latency was approx
> 3 us min/avg and max 7 with no latency spikes on 4.1.30-rt34+g667e6ba 
> SMP PREEMPT RT NXP SDK 2.0-1609.  Used sar, iperf, and stress for
> load.
> 
> Tracy Smith
> Software Engineer
> DSC
> 
> 
> 
> > 
> > On Mar 1, 2017, at 9:22 AM, "bigeasy@xxxxxxxxxxxxx"
> > <bigeasy@xxxxxxxxxxxxx> wrote:
> > 
> > > 
> > > On 2017-02-22 01:43:09 [+0000], Patel, Vedang wrote:
> > > Also, I checked the number of cpu migrations and context switches
> > > using
> > > perf stat. Following are the results for both the cases:
> > > 
> > > cyclictest not pinned to CPU: the number of CPU migrations and
> > > context
> > > switches are equal to the number of loops I am running for
> > > cyclictest.
> > > Here, there are a lot of context switches with the ktimersoftd
> > > process.
> > > 
> > > cyclictest pinned to a CPU: there are very few CPU migrations.
> > > But, the
> > > number of context switches are approximately 3 times the number
> > > of
> > > loops. Most of the context switches are with the swapper (idle)
> > > process.
> > > 
> > > In both the cases, the number of page faults are pretty low
> > > (~65).
> > > 
> > > I also tried similar experiments with mainline kernel (v4.4.47).
> > > The
> > > number of CPU migrations were pretty low (single digits) for both
> > > the
> > > cases described above. Also, the number of context switches were
> > > equal
> > > to the number of loops provided as an argument to cyclictest.
> > The hardware interrupt wakes the "timer-softirq" thread. This RT-
> > thread
> > in turn will wake up your application / cyclictest which uses the
> > posix-timers. So the priority of this thread is also important.
> > After
> > the wakeup it is possible that the newer kernel tries migrate the
> > RT
> > task if possible in order to keep it running as soon as possible.
> > If it
> > does not migrate it, then either the cyclictest task has to wait or
> > the
> > "timer-softirq" thread.
> > 
> > > 
> > > Thanks,
> > > Vedang Patel
> > > Software Engineer
> > > Intel Corporation
> > Sebastian
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-rt-
> > users" in
> > the body of a message to majordomo@xxxxxxxxxxxxxxx
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html��.n��������+%����;��w��{.n�����{�����ǫ���ܨ}���Ơz�j:+v�����w����ޙ��&�)ߡ�a����z�ޗ���ݢj��w�f




[Index of Archives]     [RT Stable]     [Kernel Newbies]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Samba]     [Video 4 Linux]     [Device Mapper]

  Powered by Linux