On 6/8/21 3:36 AM, Steven Rostedt wrote: > On Fri, 14 May 2021 22:51:18 +0200 > Daniel Bristot de Oliveira <bristot@xxxxxxxxxx> wrote: [...] >> It is possible to follow the trace by reading the trace trace file:: > > Do not need rst markup in commit logs ;-) > >> >> [root@f32 tracing]# cat trace >> # tracer: timerlat >> # >> # _-----=> irqs-off >> # / _----=> need-resched >> # | / _---=> hardirq/softirq >> # || / _--=> preempt-depth >> # || / >> # |||| ACTIVATION >> # TASK-PID CPU# |||| TIMESTAMP ID CONTEXT LATENCY >> # | | | |||| | | | | >> <idle>-0 [000] d.h1 54.029328: #1 context irq timer_latency 932 ns >> <...>-867 [000] .... 54.029339: #1 context thread timer_latency 11700 ns >> <idle>-0 [001] dNh1 54.029346: #1 context irq timer_latency 2833 ns >> <...>-868 [001] .... 54.029353: #1 context thread timer_latency 9820 ns >> <idle>-0 [000] d.h1 54.030328: #2 context irq timer_latency 769 ns >> <...>-867 [000] .... 54.030330: #2 context thread timer_latency 3070 ns >> <idle>-0 [001] d.h1 54.030344: #2 context irq timer_latency 935 ns >> <...>-868 [001] .... 54.030347: #2 context thread timer_latency 4351 ns >> >> The tracer creates a per-cpu kernel thread with real-time priority that >> prints two lines at every activation. The first is the *timer latency* >> observed at the *hardirq* context before the activation of the thread. >> The second is the *timer latency* observed by the thread, which is the >> same level that cyclictest reports. The ACTIVATION ID field > > The above is misleading. Below, I see that you state that the values are > "net values" where the thread latency does not include the irq latency. > This is not the same as cyclictest. (I had to update my ASCII art below > after reading the below statement). Replying here all the comments about the timerlat/cyclictest timeline, and gross/net values. So, yeah, my description was not clear enough. The values that are net are those reported by the *osnoise: events* only. The *timerlat tracer* values are not discounted, and that is why they are similar to the value reported by cyclictest (ok, cyclictest still captures the exit to user overhead and friends). >> serves to relate the *irq* execution to its respective *thread* execution. >> >> The irq/thread splitting is important to clarify at which context >> the unexpected high value is coming from. The *irq* context can be >> delayed by hardware related actions, such as SMIs, NMIs, IRQs >> or by a thread masking interrupts. Once the timer happens, the delay >> can also be influenced by blocking caused by threads. For example, by >> postponing the scheduler execution via preempt_disable(), by the >> scheduler execution, or by masking interrupts. Threads can >> also be delayed by the interference from other threads and IRQs. > > > I wonder if ASCII art would help clarify the above. At least for the > document (not the change log here). > > > time ==> > expected actual thread > wakeup wakeup scheduled > | | | > v v v > |---------|-------|------|----------| > ^ > | > interrupt > > |--------------| > irq latency > > |-----------| > thread latency > A liked the idea of adding a timeline! [...] >> + [root@f32 tracing]# cat trace >> + # tracer: timerlat >> + # >> + # _-----=> irqs-off >> + # / _----=> need-resched >> + # | / _---=> hardirq/softirq >> + # || / _--=> preempt-depth >> + # || / >> + # |||| ACTIVATION >> + # TASK-PID CPU# |||| TIMESTAMP ID CONTEXT LATENCY >> + # | | | |||| | | | | >> + <idle>-0 [000] d.h1 54.029328: #1 context irq timer_latency 932 ns >> + <...>-867 [000] .... 54.029339: #1 context thread timer_latency 11700 ns >> + <idle>-0 [001] dNh1 54.029346: #1 context irq timer_latency 2833 ns >> + <...>-868 [001] .... 54.029353: #1 context thread timer_latency 9820 ns >> + <idle>-0 [000] d.h1 54.030328: #2 context irq timer_latency 769 ns >> + <...>-867 [000] .... 54.030330: #2 context thread timer_latency 3070 ns >> + <idle>-0 [001] d.h1 54.030344: #2 context irq timer_latency 935 ns >> + <...>-868 [001] .... 54.030347: #2 context thread timer_latency 4351 ns >> + >> + >> +The tracer creates a per-cpu kernel thread with real-time priority that >> +prints two lines at every activation. The first is the *timer latency* >> +observed at the *hardirq* context before the activation of the thread. >> +The second is the *timer latency* observed by the thread, which is the >> +same level that cyclictest reports. The ACTIVATION ID field >> +serves to relate the *irq* execution to its respective *thread* execution. >> + >> +The *irq*/*thread* splitting is important to clarify at which context >> +the unexpected high value is coming from. The *irq* context can be >> +delayed by hardware related actions, such as SMIs, NMIs, IRQs >> +or by a thread masking interrupts. Once the timer happens, the delay >> +can also be influenced by blocking caused by threads. For example, by >> +postponing the scheduler execution via preempt_disable(), by the >> +scheduler execution, or by masking interrupts. Threads can >> +also be delayed by the interference from other threads and IRQs. > > This is where I would add that ASCII art. I am proposing a ASCII art on another point.... see bellow. > >> + >> +Tracer options >> +--------------------- >> + >> +The timerlat tracer is built on top of osnoise tracer. >> +So its configuration is also done in the osnoise/ config >> +directory. The timerlat configs are: >> + >> + - cpus: CPUs at which a timerlat thread will execute. >> + - timerlat_period_us: the period of the timerlat thread. >> + - osnoise/stop_tracing_in_us: stop the system tracing if a >> + timer latency at the *irq* context higher than the configured >> + value happens. Writing 0 disables this option. >> + - stop_tracing_out_us: stop the system tracing if a >> + timer latency at the *thread* context higher than the configured >> + value happens. Writing 0 disables this option. >> + - print_stack: save the stack of the IRQ ocurrence, and print >> + it after the *thread* read the latency. > > "thread read the latency" doesn't make sense. > > "and print it after the *thread context* event". ? > > >> + >> +timerlat and osnoise >> +---------------------------- >> + >> +The timerlat can also take advantage of the osnoise: traceevents. >> +For example:: >> + >> + [root@f32 ~]# cd /sys/kernel/tracing/ >> + [root@f32 tracing]# echo timerlat > current_tracer >> + [root@f32 tracing]# echo osnoise > set_event > > Note, set_event should be deprecated. Use: > > echo 1 > events/osnoise/enable > > instead. > > >> + [root@f32 tracing]# echo 25 > osnoise/stop_tracing_out_us >> + [root@f32 tracing]# tail -10 trace >> + cc1-87882 [005] d..h... 548.771078: #402268 context irq timer_latency 1585 ns >> + cc1-87882 [005] dNLh1.. 548.771082: irq_noise: local_timer:236 start 548.771077442 duration 4597 ns >> + cc1-87882 [005] dNLh2.. 548.771083: irq_noise: reschedule:253 start 548.771083017 duration 56 ns >> + cc1-87882 [005] dNLh2.. 548.771086: irq_noise: call_function_single:251 start 548.771083811 duration 2048 ns >> + cc1-87882 [005] dNLh2.. 548.771088: irq_noise: call_function_single:251 start 548.771086814 duration 1495 ns >> + cc1-87882 [005] dNLh2.. 548.771091: irq_noise: call_function_single:251 start 548.771089194 duration 1558 ns >> + cc1-87882 [005] dNLh2.. 548.771094: irq_noise: call_function_single:251 start 548.771091719 duration 1932 ns >> + cc1-87882 [005] dNLh2.. 548.771096: irq_noise: call_function_single:251 start 548.771094696 duration 1050 ns >> + cc1-87882 [005] d...3.. 548.771101: thread_noise: cc1:87882 start 548.771078243 duration 10909 ns >> + timerlat/5-1035 [005] ....... 548.771103: #402268 context thread timer_latency 25960 ns >> + >> +In this case, the root cause of the timer latency does not point for a >> +single, but to a series of call_function_single IPIs, followed by a 10 > > "not point to a single" > >> +*us* delay from a cc1 thread noise, along with the regular timer >> +activation. It is worth mentioning that the *duration* values reported >> +by the osnoise events are *net* values. For example, the >> +thread_noise does not include the duration of the overhead caused >> +by the IRQ execution (which indeed accounted for 12736 ns). > > As stated above, I updated my view of the ASCII art after reading this. You > should not compare what cyclictest reports as the thread latency. But what > cyclictest reports is the sum of the two (irq latency plus thread latency). Here is the point where I mention net values... so I am changing this part of documentation to: ------------------- %< ----------------------------- It is worth mentioning that the *duration* values reported by the osnoise: events are *net* values. For example, the thread_noise does not include the duration of the overhead caused by the IRQ execution (which indeed accounted for 12736 ns). But the values reported by the timerlat tracer (timerlat_latency) are *gross* values. The art below illustrates a CPU timeline and how the timerlat tracer observes it at the top and the osnoise: events at the bottom. Each "-" in the timelines means 1 us, and the time moves ==>: External context irq context thread clock timer_latency timer_latency event 18 us 48 us | ^ ^ v | | |------------------| | <-- timerlat irq timeline |------------------+-----------------------------| <-- timerlat thread timeline ^ ^ ===================== CPU timeline ====================================== [timerlat/ irq] [ dev irq ] [another thread...^ v..^ v........][timerlat/ thread] ===================== CPU timeline ====================================== |-------------| |---------| <-- irq_noise timeline |--^ v--------| <-- thread_noise timeline | | | | | + thread_noise: 10 us | +-> irq_noise: 9 us +-> irq_noise: 13 us --------------- >% -------------------------------- thoughts? -- Daniel