On 2020-02-20 11:56:23 [-0500], Kansky, Jan E. wrote:
> For both kernels, I run cyclictest:
> ./cyclictest -p 98 --smp -b 600 -f -m

You could use the -b option to stop the trace once you hit the 2ms latency.

> The trace results show that the unexpected large latency seems to
> occur in the following ways:
>
> llvmpipe-9436 3d...... 4901974us!: switch_fpu_return
> <-prepare_exit_to_usermode
> llvmpipe-9436 3d...... 4902845us : smp_apic_timer_interrupt
> <-apic_timer_interrupt

It is sometimes hard to read with the additional line feed. However, this is
probably okay because you return to user space and come back later after a
timer interrupt.

> or
>
> Xorg-8825 3....... 4905576us!: kfree <-__audit_syscall_exit
> <idle>-0 2d...1.. 4905876us : smp_apic_timer_interrupt
> <-apic_timer_interrupt
>
> or
> <idle>-0 2d...1.. 4905910us!: mwait_idle <-default_idle_call
> <idle>-0 0d...1.. 4906049us : smp_apic_timer_interrupt
> <-apic_timer_interrupt
>
> or
>
> llvmpipe-9435 1d...... 4917323us!: rcu_irq_exit <-irq_exit
> llvmpipe-9438 3d...... 4917845us : smp_apic_timer_interrupt
> <-apic_timer_interrupt

The number before the d is the CPU number. So if you kfree() on CPU3
followed by smp_apic_timer_interrupt() on CPU2, there is no need to worry.
The same goes for the other two examples: the two lines of each pair are on
different CPUs. You need to look at what happens on the same CPU after that
gap.

> I do see NMIs occurring on the system, although not all latency events
> seem to correlate with an increment in the NMI counter in
> /proc/interrupts.

perf or the "hardware watchdog" may be responsible for some of them. If you
suspect that the BIOS is doing something, there is CONFIG_HWLAT_TRACER to
prove it.

> I would greatly appreciate any advice on what I should do to trace the
> problem with this new system. I can send my .config files if needed.
> CONFIG_PREEMPT_RT_FULL=y is set.
>
> Thanks!
> Jan

Sebastian
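
PS: To make the per-CPU reading above less error-prone, here is a minimal Python sketch that groups trace entries by CPU and reports the time gap between consecutive entries on the same CPU. It is only an illustration: the regular expression is inferred from the trace lines quoted in this mail and may need adjusting for other trace formats, and the names parse_line, per_cpu_gaps and SAMPLE are made up for this example. Wrapped continuation lines such as "<-apic_timer_interrupt" are simply skipped.

```python
import re
from collections import defaultdict

# Assumption: the line layout is inferred from the trace lines quoted in this
# thread, e.g. "Xorg-8825 3....... 4905576us!: kfree <-__audit_syscall_exit".
# An optional leading "> " mail quote marker is tolerated.
LINE_RE = re.compile(
    r"^\s*(?:>\s*)*(?P<task>.+?)-(?P<pid>\d+)\s+"   # task name and PID
    r"(?P<cpu>\d+)(?P<flags>\S*)\s+"                 # CPU number, then flags
    r"(?P<ts>\d+)us\s*[!+#]?\s*:\s*"                 # timestamp, delay marker
    r"(?P<func>\S+)"                                 # traced function
)


def parse_line(line):
    """Return a dict for one trace entry, or None if the line does not match
    (for example a wrapped "<-caller" continuation line)."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    return {
        "task": m.group("task"),
        "pid": int(m.group("pid")),
        "cpu": int(m.group("cpu")),
        "ts_us": int(m.group("ts")),
        "func": m.group("func"),
    }


def per_cpu_gaps(lines, min_gap_us=0):
    """Map CPU -> list of (gap_us, previous_func, next_func), comparing only
    consecutive entries that ran on the same CPU."""
    last = {}
    gaps = defaultdict(list)
    for line in lines:
        entry = parse_line(line)
        if entry is None:
            continue
        prev = last.get(entry["cpu"])
        if prev is not None:
            gap = entry["ts_us"] - prev["ts_us"]
            if gap >= min_gap_us:
                gaps[entry["cpu"]].append((gap, prev["func"], entry["func"]))
        last[entry["cpu"]] = entry
    return dict(gaps)


# Four of the entries quoted above, with the wrapped lines joined.
SAMPLE = [
    "Xorg-8825 3....... 4905576us!: kfree <-__audit_syscall_exit",
    "<idle>-0 2d...1.. 4905876us : smp_apic_timer_interrupt <-apic_timer_interrupt",
    "<idle>-0 2d...1.. 4905910us!: mwait_idle <-default_idle_call",
    "<idle>-0 0d...1.. 4906049us : smp_apic_timer_interrupt <-apic_timer_interrupt",
]

if __name__ == "__main__":
    for cpu, entries in sorted(per_cpu_gaps(SAMPLE).items()):
        for gap, before, after in entries:
            print(f"CPU{cpu}: {gap}us between {before} and {after}")
```

On the sample, only CPU2 has two entries, so the 300us jump between the CPU3 and CPU2 lines is not reported as a gap. Feeding the full trace and passing a threshold via min_gap_us should show which CPU actually stalled.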