On Thu, Mar 29, 2018 at 07:30:27AM +0000, Phil Edworthy wrote:
> Hi John, Clark,
>
> On 28 March 2018 16:32, Clark Williams wrote:
> > On Wed, 28 Mar 2018 16:56:27 +0200
> > John Ogness <john.ogness@xxxxxxxxxxxxx> wrote:
> >
> > > On 2018-03-28, Phil Edworthy <phil.edworthy@xxxxxxxxxxx> wrote:
> > > >> > I found that cyclictest results vary from one run to another.
> > > >> >
> > > >> > [...]
> > > >> >
> > > >> > Is it common knowledge that cyclictest results vary so much from
> > > >> > one run to another? Any ideas how to mitigate this?
> > > >>
> > > >> It would be helpful if you provided the command arguments you use
> > > >> for your tests. Particularly important options to consider:
> > > >>
> > > >> -a / --affinity
> > > >> -m / --mlockall
> > > >> -n / --nanosleep
> > > >> -t / --threads
> > > >> --secaligned
> > > >>
> > > >> and of course giving it an appropriate realtime priority:
> > > >>
> > > >> -p / --priority
> > > >
> > > > Sure:
> > > > cyclictest -m -n -Sp99 -i200 -h300 -M -D 10h
> > >
> > > I would recommend using prio 98 instead of 99. In general,
> > > applications should not be taking the CPU from the migration or
> > > watchdog tasks. And usually you want cyclictest to reflect the
> > > latencies of real applications.
> >
> > Agree, please don't use fifo:99. Honestly there's no difference between
> > fifo:51 and fifo:98. The interrupt threads default to fifo:50, so you want to be
> > above that but no real need to contend with migration, watchdog or posix
> > timers.
> Ok, I have changed the pri to 98, no difference in the results that I can see.
>
> I did some overnight tests with 100 runs of cyclictest running for 1 minute.
> Stats below were calculated using stats package from http://web.cs.wpi.edu/~claypool/misc/stats/stats.html
>
> 1. Interval fixed to 400us, not using --secalign
> Min: 20 Avg: 37 Max: 187 (avg of 100xMax is 134)
>
> 2. Interval fixed to 400us, using --secalign
> Min: 20 Avg: 37 Max: 177 (avg of 100xMax is 150)
>
> 3. Interval increases from 400 to 499, not using --secalign
> Min: 20 Avg: 37 Max: 211 (avg of 100xMax is 157)
>
> 4. Interval increases from 400 to 499, using --secalign
> Min: 20 Avg: 37 Max: 202 (avg of 100xMax is 157)
>
> While --secalign may provide more consistent results, it appears that it is
> not as good at identifying the worst case latency.
> It appears that testing different intervals is much better at identifying the
> worst case latency.

Have you used the hwlat ftrace tracer or hwlatdetector.py from rt-tests to
check whether your system has SMI-induced latency spikes?

That may not be part of the problem described here, but spurious SMI spikes
could account for some of the discrepancies.

Luis
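
For anyone wanting to reproduce the kind of per-case summary quoted above
(Min / Avg / Max plus "avg of 100xMax"), here is a minimal Python sketch.
It is not the stats package Phil used; the function name summarize() and
the sample numbers are made up purely for illustration, and it assumes
every run has the same length so that per-run averages can be weighted
equally.

```python
from statistics import mean


def summarize(runs):
    """Summarize a batch of cyclictest runs.

    runs: list of (min_us, avg_us, max_us) tuples, one tuple per run.
    Hypothetical helper for illustration; not part of rt-tests.
    """
    if not runs:
        raise ValueError("need at least one run")

    mins = [r[0] for r in runs]
    avgs = [r[1] for r in runs]
    maxes = [r[2] for r in runs]

    return {
        "min": min(mins),                         # best case seen in any run
        "avg": mean(avgs),                        # assumes equal-length runs
        "max": max(maxes),                        # worst case seen in any run
        "avg_of_max": mean(maxes),                # the "avg of NxMax" figure
        "max_spread": max(maxes) - min(maxes),    # run-to-run variation of Max
    }


if __name__ == "__main__":
    # Made-up values in microseconds, NOT measurements from this thread.
    sample = [(10, 20, 50), (12, 22, 90), (11, 21, 70)]
    for key, value in summarize(sample).items():
        print(f"{key}: {value}")
```

The max_spread value is an addition of mine: a large gap between the
smallest and largest per-run Max is one quick way to see how much the
worst case moves between runs, which is the variation being discussed here.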