On 2/22/23 09:39, Prasad Pandit wrote:
> Hello Daniel,
>
> Thank you so much for your reply, I appreciate it.
>
> On Wed, 22 Feb 2023 at 17:30, Daniel Bristot de Oliveira <bristot@xxxxxxxxxx> wrote:
>
>     This is the timerlat's timer, so it is expected. What this trace is pointing is to
>     a possible exit from idle latency... so idle tune is required for this system
>     and *this metric*... but
>
> * Idle tune?
>
>     Yes, that is expected on timerlat in an isolated CPU. But not with osnoise/oslat kind of tool,
>     as they keep running, while timerlat/cyclictest go to sleep.
>
> * I see, okay.
>
>     Let me know how rtla osnoise results are, so I can help more.
>
> * Yes, I've been running oslat(1) and rtla-osnoise(1) too.
>   Please see:
>     oslat(1) log -> https://0bin.net/paste/T0PDXHz5#AnNEzkTRxQVT1gvAqKM43jW+yhqilbNbFqHIHHpy4MY
>     rtla-osnoise-top(1) log -> https://0bin.net/paste/8qwjebnZ#22sfTYTv68JAAMHZJhnCBTP-uvP7Mxj8ipAVbuQVsiy

The problem in the oslat case is that trace-cmd is awakened in the isolated CPU. That is
probably because trace-cmd once ran and armed a timer there. I recommend you restrict
the affinity of trace-cmd to the non-isolated CPUs before starting it and run the
experiment again.

However, a busy loop in FIFO:95 is not a good setup. That is because you have to raise
the priority of other things like the ktimer because of this. Like in your example,
ktimer as FIFO:97... it is hard to justify this as a sane setup.

In a properly isolated CPU, SCHED_OTHER should be enough. I understand that people use
FIFO because it gives the impression that the busy loop will receive more CPU time, but
this is biased by tools that only measure the single latency occurrence - and not
overall latency.
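As an illustration of the trace-cmd affinity suggestion above, here is a minimal sketch of a wrapper that pins itself to the non-isolated CPUs and then executes a command. It is not part of trace-cmd or rtla. The helper names (parse_cpu_list, housekeeping_cpus, run_pinned) are hypothetical, and it assumes the isolation was set up with isolcpus=, so the isolated CPUs show up in /sys/devices/system/cpu/isolated. It also assumes the started tool does not change its own affinity afterwards.

```python
import os
import sys


def parse_cpu_list(text):
    """Parse a kernel CPU list such as "2-5,8" into a set of CPU numbers."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # an empty file means "no CPUs"
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def housekeeping_cpus(online, isolated):
    """Return the online CPUs that are not in the isolated list."""
    return parse_cpu_list(online) - parse_cpu_list(isolated)


def run_pinned(argv, cpus):
    """Restrict this process to `cpus`, then replace it with `argv`.

    The affinity mask is inherited across exec, so the command starts
    already confined to the given CPUs.
    """
    os.sched_setaffinity(0, cpus)
    os.execvp(argv[0], argv)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Assumption: both sysfs files exist (isolation configured via isolcpus=).
    with open("/sys/devices/system/cpu/online") as f:
        online = f.read()
    with open("/sys/devices/system/cpu/isolated") as f:
        isolated = f.read()
    cpus = housekeeping_cpus(online, isolated)
    if cpus:
        run_pinned(sys.argv[1:], cpus)
```

Saved under a hypothetical name such as pin_housekeeping.py, it would be run with the tracing command as its arguments, so that the command never starts on an isolated CPU.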
On FIFO versus SCHED_OTHER, see this article:

  https://research.redhat.com/blog/article/osnoise-for-fine-tuning-operating-system-noise-in-linux-kernel/

While running with FIFO reduces the "max single noise" by two us (from 7 to 5 us)
compared with SCHED_OTHER, the total amount of noise seen by the tool running with
FIFO is larger, because the starvation of tasks requires further checks from the OS
side, generating further noise. So SCHED_OTHER is better for total noise.

In properly isolated systems, the solution is to try to avoid things on the CPUs, not
to starve them. If a job pinned to a CPU cannot be avoided, just let it run. Keeping
the system in the starving condition is keeping the system in a faulty state, and the
work to take the system out of this situation (like using throttling or stalld) will
only cause more noise.

-- Daniel