Hello, Thank you for helping :) >> I am an engineering student and I try to proof that a 4000Hz hard real-time >> application can run on an ARM board rather than on a more powerful machine. >> >> I work with an IMX6 dual-core and PREEMPT_RT patch-4.1.38-rt46. >> I expected that my 4000Hz thread will perform better if it is the only one >> on core1, so I put the boot argument isolcpus=1 and bound my thread to cpu1. >> >> With the isolcpus=1, note that it remains these processes on core1: >> >> PID PSR RTPRIO CMD >> 16 1 99 [migration/1] >> 17 1 - [rcuc/1] >> 18 1 1 [ktimersoftd/1] >> 19 1 - [ksoftirqd/1] >> 20 1 99 [posixcputmr/1] >> 21 1 - [kworker/1:0] >> 22 1 - [kworker/1:0H] >> >> I tried several permutations in my kernel configuration and boot args >> (rcu_nocbs is an example) and none affected the results I describe below. > > In general with isolcpus you should be able to get most task off CPU1. > The rcu_nocbs option should get the RCU callbacks off CPU1. And then you > can change the IRQ affinity to CPU0. But this looks good. As I said, in my previous message, I deactivated such options as I was not sure and as I did not observe any enhancement when activating them. I agree it is what I want so I reactivated them and my kernel config looks like: # # Timers subsystem # CONFIG_TICK_ONESHOT=y CONFIG_NO_HZ_COMMON=y # CONFIG_HZ_PERIODIC is not set # CONFIG_NO_HZ_IDLE is not set CONFIG_NO_HZ_FULL=y CONFIG_NO_HZ_FULL_ALL=y # CONFIG_NO_HZ_FULL_SYSIDLE is not set CONFIG_NO_HZ=y CONFIG_HIGH_RES_TIMERS=y # # RCU Subsystem # ... CONFIG_RCU_NOCB_CPU=y # CONFIG_RCU_NOCB_CPU_NONE is not set # CONFIG_RCU_NOCB_CPU_ZERO is not set CONFIG_RCU_NOCB_CPU_ALL=y # # CPU Frequency scaling # CONFIG_CPU_FREQ=y # CONFIG_CPU_FREQ_STAT is not set CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE=y # CONFIG_CPU_FREQ_DEFAULT_GOV_POWERSAVE is not set ... CONFIG_CPU_FREQ_GOV_PERFORMANCE=y # CONFIG_CPU_FREQ_GOV_POWERSAVE is not set ... CONFIG_ARM_IMX6Q_CPUFREQ=y ... # # CPU Idle # # CONFIG_CPU_IDLE is not set # CONFIG_ARCH_NEEDS_CPU_IDLE_COUPLED is not set # # Kernel Features # CONFIG_HZ_FIXED=0 CONFIG_HZ_100=y ... CONFIG_HZ=100 >> I use a script to stress Linux. I expected that only cpu0 will be stressed >> as cpu1 is isolated. But it has an impact on thread on cpu1 too. >> I think it is normal. > > The CPU-caches might be shared. Locks which which are held by CPU0 and > required by CPU1 will also slow down CPU1. > The sched_switch tracer should show you if anything of your "script to > stress" migrates to CPU1 and/or if the "delays" CPU1. That would make > your task on CPU1 go from state R to D and back to R. Thank you, I never used Linux kernel tracing before. It was not enabled in my Kernel, I just did it. I have some results below. >> First, as I draw it (in red) on “expected_behavior.png”, I expected much less >> variations in the Latency and especially the Execution time. >> (My thread always does the same thing). > > You could try cyclictest and compare its latency. However from the plot > it looks at about 25us so it is not that bad. > >> How can we explain so much time variations? As I said, I tried to deactivate >> all interrupts on cpu1 (rcu and others processes above) but I am not very >> familiar with that. >> > As I suggested above, run sched_switch, measure the latency and if you > hit an execution time of 150us then tell your application to disable the > trace (a write '0' to the tracing_on sysfs file would do it) and then > you can look at the trace and see if got interrupted by anything. > >> Then, I am even more surprised when, trying to debug that, I decided to put >> another thread on core0 and it improved the behavior of the thread on core1! > > If you put more load on the system and the numbers improve then this > might be because the caches are filled with the "right" data. Also it > might prevent power management from doing its job. About power management, as you can see in the kernel conf, *it is enabled* but *only the performance mode* is enabled. Indeed, I already worked on that, I first deactivated it but then my clock was not at its maximum. I succeeded in getting the max clock frequency back by doing this. >> My application looks like: Thanks for reading the code :) I already checked if next <= now, look how I did it in the updated code. I changed to CLOCK_MONOTONIC, thanks. I try to do error checking! > … >> thread1(){ >> >> struct timespec start, stop, next, interval = 250us; >> >> /* Initialization of the periodicity */ >> clock_gettime(CLOCK_MONOTONIC, &next); >> >> /* as time basis, try not to use a random time but start with usec = 25 * >> * or 50 or so. You should be able to avoid the HZ timer. */ >> next.tv_sec += 1; >> next.tv_nsec = 250000; Is it what you expected I did? >> next += interval; >> >> while(1){ >> /*Releases at specified rate*/ >> if(clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL) != 0){ >> perror("error"); >> exit(-1); >> } >> /*Get time to check jitter and execution time*/ >> if(clock_gettime(CLOCK_MONOTONIC, &start) != 0){ >> perror("error"); >> exit(-1); >> } >> do_job(); >> /*Get time to check execution time*/ >> if(clock_gettime(CLOCK_MONOTONIC, &stop) != 0){ >> perror("error"); >> exit(-1); >> } >> do_stat(); //jitter = start-next; exec_time = stop-start >> while(next <= stop){ >> next += interval; >> } >> } >> >> } >> >> >> thread0(){ >> struct timespec next, interval = 250us; >> >> /* Initialization of the periodicity */ >> clock_gettime(CLOCK_REALTIME, &next); >> next += interval; >> >> while(1){ >> /*Releases at specified rate*/ >> clock_nanosleep(CLOCK_REALTIME, TIMER_ABSTIME, &next, NULL); >> usleep(100); >> /**************************************************************** >> * Without sleeping 100us, only the Latency of the other thread * >> * (on cpu1) is improved. * >> * Sleeping 100us in this new 4000Hz thread (cpu0) improved * >> * the execution time of the other thread (on cpu1)... * >> ****************************************************************/ > > I'm not sure but it looks like the usleep() is implemented as a > busy loop. A busy loop of 100us isn't *that* bad but if all RT tasks > take around 950us then the scheduler will prevent all RT tasks from > running (in order to prevent a possible lockup). > >> next += interval; >> } >> >> } >> >> As you can see in “background_thread_on_core_0.png”, the Latency and the >> Execution time (of the thread on core1) are improved (in comparison with >> “no_background_thread.png”) when there is a new 4000Hz thread on cpu0 >> AND when this thread does something... >> >> I tried a lot of permutations and I do not understand: >> - If the new thread (cpu0) is at 5000Hz (>4000Hz), then observations >> are the same (performance of the thread on cpu1 improves) >> - If the new thread is at 2000HZ (<4000Hz), then there is no improvement... >> >> - If the new thread (4000Hz on cpu0) does something (even sleeping enough >> time), then the Execution time of the thread on cpu1 improves. >> - If the new thread does nothing (or do too few stuff), then, ONLY the >> Latency of the thread on cpu1 is improved... >> >> Do you have any experience with that, any idea to debug? > > - tracing to see if the scheduler puts another task on while your RT is > running > - tracing to see if an interrupts fires which prevents your task from > running. This should only be the timer interrupt since everything else > should be only on CPU0. I analyse a trace below. There is only the ktimersoftd/1 task which comes back every 10000us. I first had CONFIG_HZ=1000 so it was every 1000us then. I first thought it was responsible for my problems. But finally it seems not. I am not sure. > - check if any power management is on and try to disable it. As I said above, it is *on* but *only perf mode is allowed*. Is it correct? > - as time basis, try not to use a random time but start with usec = 25 > or 50 or so. You should be able to avoid the HZ timer. I tried to start with usec = 25. Is it correct in the code above? Do you talk about the CONFIG_NO_HZ option? I am confident it is what I want, but, according to the documentation, I am afraid: "POSIX CPU timers prevent CPUs from entering adaptive-tick mode. Real-time applications needing to take actions based on CPU time consumption need to use other means of doing so." Indeed, I want to continue to use the POSIX CPU timer (sleep and gettime). What's more, this test from Frederic fails: https://kernel.googlesource.com/pub/scm/linux/kernel/git/frederic/dynticks-testing/+/9b39ab06956d484d5311e0678e1fba8104a8e2e3 According to this message, I can not stop tick: user_loop-15296 [001] dn..... 8131.952509: tick_stop: success=no msg=more than 1 task in runqueue Moreover, I do not see this message when tracing my application below. I attached the trace (trace.txt) to this message. The trace stopped when exec_time was more than 200us. I "normalized" (put to 0) the timestamp two times to understand what is going on. What I understand: Lines 5-8: It is the clock_nanosleep handler. It takes 13us to wake up the thread. Good. Line 9: The thread took 60us to run. Very good. Line 10: Let's start again 250us later. Good. Line 11: tick_sched_timer... It will wake up the ktimersoftd/1. I am afraid :( Line 15: It tooks 35us to wake up the thread... Not so bad, sometimes it is more... Line 16: The thread took 68us to run. Ok. Lines 17-19: ktimersoftd/1 took 17us to run (370-353). I think it is not so embarrassing. I am reassured :) ... Line 69: 1000us, ok Line 74: 1256us, 6us late :( Line 77: 36us to wake up the thread, +6us = 42us jitter :( :( Line 78: ***255us (1547-1292) execution time !!!*** Line 79: Overrun, next rdv at 1750us, 8us late... Does this trace help? According to me, there is nothing on CPU1. There is just this ktimersoftd/1. Can the idle task (on cpu0) be delayed by the stress on cpu0? Can we migrate the Linux scheduler to cpu1? Thanks again, Yann
. . . <idle>-0 [001] d..h1.. 0 : irq_handler_entry: irq=16 name=twd <idle>-0 [001] d..h1.. 2 : hrtimer_expire_entry: hrtimer=98117ef0 function=hrtimer_wakeup now=1864569754886 <idle>-0 [001] dn.h1.. 7 : irq_handler_exit: irq=16 ret=handled <idle>-0 [001] d...2.. 13 : sched_switch: prev_comm=swapper/1 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=test_preempt next_pid=5583 next_prio=1 test_preempt-5583 [001] d...2.. 73 : sched_switch: prev_comm=test_preempt prev_pid=5583 prev_prio=1 prev_state=S ==> next_comm=swapper/1 next_pid=0 next_prio=120 <idle>-0 [001] d..h1.. 250 : irq_handler_entry: irq=16 name=twd <idle>-0 [001] d..h1.. 253 : hrtimer_expire_entry: hrtimer=bf7af990 function=tick_sched_timer now=1864570005553 <idle>-0 [001] d..h1.. 259 : softirq_raise: vec=1 [action=TIMER] <idle>-0 [001] d..h1.. 263 : hrtimer_expire_entry: hrtimer=98117ef0 function=hrtimer_wakeup now=1864570005553 <idle>-0 [001] dn.h1.. 268 : irq_handler_exit: irq=16 ret=handled <idle>-0 [001] d...2.. 285 : sched_switch: prev_comm=swapper/1 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=test_preempt next_pid=5583 next_prio=1 test_preempt-5583 [001] d...2.. 353 : sched_switch: prev_comm=test_preempt prev_pid=5583 prev_prio=1 prev_state=S ==> next_comm=ktimersoftd/1 next_pid=21 next_prio=98 ktimersoftd/1-21 [001] .....11 361 : softirq_entry: vec=1 [action=TIMER] ktimersoftd/1-21 [001] .....11 364 : softirq_exit: vec=1 [action=TIMER] ktimersoftd/1-21 [001] d...2.. 370 : sched_switch: prev_comm=ktimersoftd/1 prev_pid=21 prev_prio=98 prev_state=S ==> next_comm=swapper/1 next_pid=0 next_prio=120 <idle>-0 [001] d..h1.. 501 : irq_handler_entry: irq=16 name=twd <idle>-0 [001] d..h1.. 505 : hrtimer_expire_entry: hrtimer=98117ef0 function=hrtimer_wakeup now=1864570257220 <idle>-0 [001] dn.h1.. 512 : irq_handler_exit: irq=16 ret=handled <idle>-0 [001] d...2.. 520 : sched_switch: prev_comm=swapper/1 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=test_preempt next_pid=5583 next_prio=1 test_preempt-5583 [001] d...2.. 605 : sched_switch: prev_comm=test_preempt prev_pid=5583 prev_prio=1 prev_state=S ==> next_comm=swapper/1 next_pid=0 next_prio=120 <idle>-0 [001] d..h1.. 750 : irq_handler_entry: irq=16 name=twd <idle>-0 [001] d..h1.. 754 : hrtimer_expire_entry: hrtimer=98117ef0 function=hrtimer_wakeup now=1864570506220 <idle>-0 [001] dn.h1.. 760 : irq_handler_exit: irq=16 ret=handled <idle>-0 [001] d...2.. 767 : sched_switch: prev_comm=swapper/1 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=test_preempt next_pid=5583 next_prio=1 test_preempt-5583 [001] d...2.. 841 : sched_switch: prev_comm=test_preempt prev_pid=5583 prev_prio=1 prev_state=S ==> next_comm=swapper/1 next_pid=0 next_prio=120 <idle>-0 [001] d..h1.. 1000 : irq_handler_entry: irq=16 name=twd <idle>-0 [001] d..h1.. 1004 : hrtimer_expire_entry: hrtimer=98117ef0 function=hrtimer_wakeup now=1864570755220 <idle>-0 [001] dn.h1.. 1010 : irq_handler_exit: irq=16 ret=handled <idle>-0 [001] d...2.. 1016 : sched_switch: prev_comm=swapper/1 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=test_preempt next_pid=5583 next_prio=1 test_preempt-5583 [001] d...2.. 1088 : sched_switch: prev_comm=test_preempt prev_pid=5583 prev_prio=1 prev_state=S ==> next_comm=swapper/1 next_pid=0 next_prio=120 <idle>-0 [001] d..h1.. 1250 : irq_handler_entry: irq=16 name=twd <idle>-0 [001] d..h1.. 1254 : hrtimer_expire_entry: hrtimer=98117ef0 function=hrtimer_wakeup now=1864571005886 <idle>-0 [001] dn.h1.. 1259 : irq_handler_exit: irq=16 ret=handled <idle>-0 [001] d...2.. 1265 : sched_switch: prev_comm=swapper/1 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=test_preempt next_pid=5583 next_prio=1 . . . <idle>-0 [001] d..h1.. 0 : irq_handler_entry: irq=16 name=twd <idle>-0 [001] d..h1.. 3 : hrtimer_expire_entry: hrtimer=bf7af990 function=tick_sched_timer now=1864580005886 <idle>-0 [001] d..h1.. 10 : softirq_raise: vec=1 [action=TIMER] <idle>-0 [001] d..h1.. 15 : hrtimer_expire_entry: hrtimer=98117ef0 function=hrtimer_wakeup now=1864580005886 <idle>-0 [001] dn.h1.. 21 : irq_handler_exit: irq=16 ret=handled <idle>-0 [001] d...2.. 41 : sched_switch: prev_comm=swapper/1 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=test_preempt next_pid=5583 next_prio=1 test_preempt-5583 [001] d...2.. 123 : sched_switch: prev_comm=test_preempt prev_pid=5583 prev_prio=1 prev_state=S ==> next_comm=ktimersoftd/1 next_pid=21 next_prio=98 ktimersoftd/1-21 [001] .....11 135 : softirq_entry: vec=1 [action=TIMER] ktimersoftd/1-21 [001] .....11 138 : softirq_exit: vec=1 [action=TIMER] ktimersoftd/1-21 [001] d...2.. 147 : sched_switch: prev_comm=ktimersoftd/1 prev_pid=21 prev_prio=98 prev_state=S ==> next_comm=swapper/1 next_pid=0 next_prio=120 <idle>-0 [001] d..h1.. 251 : irq_handler_entry: irq=16 name=twd <idle>-0 [001] d..h1.. 255 : hrtimer_expire_entry: hrtimer=98117ef0 function=hrtimer_wakeup now=1864580256553 <idle>-0 [001] dn.h1.. 260 : irq_handler_exit: irq=16 ret=handled <idle>-0 [001] d...2.. 267 : sched_switch: prev_comm=swapper/1 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=test_preempt next_pid=5583 next_prio=1 test_preempt-5583 [001] d...2.. 359 : sched_switch: prev_comm=test_preempt prev_pid=5583 prev_prio=1 prev_state=S ==> next_comm=swapper/1 next_pid=0 next_prio=120 <idle>-0 [001] d..h1.. 500 : irq_handler_entry: irq=16 name=twd <idle>-0 [001] d..h1.. 504 : hrtimer_expire_entry: hrtimer=98117ef0 function=hrtimer_wakeup now=1864580506220 <idle>-0 [001] dn.h1.. 510 : irq_handler_exit: irq=16 ret=handled <idle>-0 [001] d...2.. 519 : sched_switch: prev_comm=swapper/1 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=test_preempt next_pid=5583 next_prio=1 test_preempt-5583 [001] d...2.. 610 : sched_switch: prev_comm=test_preempt prev_pid=5583 prev_prio=1 prev_state=S ==> next_comm=swapper/1 next_pid=0 next_prio=120 <idle>-0 [001] d..h1.. 750 : irq_handler_entry: irq=16 name=twd <idle>-0 [001] d..h1.. 754 : hrtimer_expire_entry: hrtimer=98117ef0 function=hrtimer_wakeup now=1864580755886 <idle>-0 [001] dn.h1.. 760 : irq_handler_exit: irq=16 ret=handled <idle>-0 [001] d...2.. 769 : sched_switch: prev_comm=swapper/1 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=test_preempt next_pid=5583 next_prio=1 test_preempt-5583 [001] d...2.. 856 : sched_switch: prev_comm=test_preempt prev_pid=5583 prev_prio=1 prev_state=S ==> next_comm=swapper/1 next_pid=0 next_prio=120 <idle>-0 [001] d..h1.. 1000 : irq_handler_entry: irq=16 name=twd <idle>-0 [001] d..h1.. 1004 : hrtimer_expire_entry: hrtimer=98117ef0 function=hrtimer_wakeup now=1864581005886 <idle>-0 [001] dn.h1.. 1010 : irq_handler_exit: irq=16 ret=handled <idle>-0 [001] d...2.. 1017 : sched_switch: prev_comm=swapper/1 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=test_preempt next_pid=5583 next_prio=1 test_preempt-5583 [001] d...2.. 1110 : sched_switch: prev_comm=test_preempt prev_pid=5583 prev_prio=1 prev_state=S ==> next_comm=swapper/1 next_pid=0 next_prio=120 <idle>-0 [001] d..h1.. 1256 : irq_handler_entry: irq=16 name=twd <idle>-0 [001] d..h1.. 1265 : hrtimer_expire_entry: hrtimer=98117ef0 function=hrtimer_wakeup now=1864581264220 <idle>-0 [001] dn.h1.. 1277 : irq_handler_exit: irq=16 ret=handled <idle>-0 [001] d...2.. 1292 : sched_switch: prev_comm=swapper/1 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=test_preempt next_pid=5583 next_prio=1 test_preempt-5583 [001] d...2.. 1547 : sched_switch: prev_comm=test_preempt prev_pid=5583 prev_prio=1 prev_state=S ==> next_comm=swapper/1 next_pid=0 next_prio=120 <idle>-0 [001] d..h1.. 1758 : irq_handler_entry: irq=16 name=twd <idle>-0 [001] d..h1.. 1770 : hrtimer_expire_entry: hrtimer=98117ef0 function=hrtimer_wakeup now=1864581767220 <idle>-0 [001] dn.h1.. 1788 : irq_handler_exit: irq=16 ret=handled <idle>-0 [001] d...2.. 1808 : sched_switch: prev_comm=swapper/1 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=test_preempt next_pid=5583 next_prio=1 exec_time > 200 : echo 0 > tracing_on