Hi all!

> Hi Julia,
>
> thanks for the detailed analysis!
>
> > [...]
> > Okay, we finally received our wakeup event. We were expecting to be
> > woken up at 10024735653388ns, but were actually woken up at
> > 10024735682387ns.
> >
> > 10024735682387 - 10024735653388 = 28999ns
> >
> > Our timer fired ~29us late! But why...?
> >
> > Sorry I don't have answers, just more questions. I do wonder what
> > kworker/0:3 was up to at the time the timer interrupt should have fired.
> >
> > Julia

I have now instrumented the igb driver to generate trace points that allow me to
identify in more detail what is going on.
This delivered interesting results!

Here are some places where I added traces:

In file igb_ptp.c:

    void igb_ptp_rx_hang(struct igb_adapter *adapter)
    {
            struct e1000_hw *hw = &adapter->hw;
            unsigned long rx_event;
            u32 tsyncrxctl;

            trace_igb(700);
            tsyncrxctl = rd32(E1000_TSYNCRXCTL);
            trace_igb(701);

            /* Other hardware uses per-packet timestamps */
            if (hw->mac.type != e1000_82576)
                    return;
            ...

In file igb_main.c:

    static void igb_check_lvmmc(struct igb_adapter *adapter)
    {
            struct e1000_hw *hw = &adapter->hw;
            u32 lvmmc;

            trace_igb(600);
            lvmmc = rd32(E1000_LVMMC);
            trace_igb(601);
            if (lvmmc) {
            ...

When I now run my test, I get the following trace:

[...]
kworker/-607     0.......  107315621us+: igb: val: 700
kworker/-607     0d..h...  107315640us : irq_handler_entry: irq=47 name=eth2-rx-0
kworker/-607     0d..h...  107315640us : irq_handler_exit: irq=47 ret=handled
kworker/-607     0d..h1..  107315640us : sched_waking: comm=irq/47-eth2-rx- pid=18009 prio=49 target_cpu=000
kworker/-607     0dN.h2..  107315641us : sched_wakeup: comm=irq/47-eth2-rx- pid=18009 prio=49 target_cpu=000
kworker/-607     0dN.h1..  107315643us : irq_handler_entry: irq=48 name=eth2-tx-0
kworker/-607     0dN.h1..  107315643us : irq_handler_exit: irq=48 ret=handled
kworker/-607     0dN.h2..
107315643us : sched_waking: comm=irq/48-eth2-tx- pid=18010 prio=49 target_cpu=000
kworker/-607     0dN.h3..  107315644us : sched_wakeup: comm=irq/48-eth2-tx- pid=18010 prio=49 target_cpu=000
kworker/-607     0dN..1..  107315644us : rcu_utilization: Start context switch
kworker/-607     0dN..1..  107315644us : rcu_utilization: End context switch
kworker/-607     0dN..2..  107315644us : sched_stat_runtime: comm=kworker/0:1 pid=607 runtime=88996 [ns] vruntime=49754678074 [ns]
kworker/-607     0d...2..  107315645us : sched_switch: prev_comm=kworker/0:1 prev_pid=607 prev_prio=120 prev_state=R+ ==> next_comm=irq/47-eth2-rx- next_pid=18009 next_prio=49
irq/47-e-18009   0d....11  107315646us : softirq_raise: vec=3 [action=NET_RX]
irq/47-e-18009   0.....12  107315646us : softirq_entry: vec=3 [action=NET_RX]
irq/47-e-18009   0.....12  107315647us : napi_poll: napi poll on napi struct ffff88040ae58c50 for device eth2 work 0 budget 64
irq/47-e-18009   0.....12  107315647us : softirq_exit: vec=3 [action=NET_RX]
irq/47-e-18009   0d...1..  107315648us : rcu_utilization: Start context switch
irq/47-e-18009   0d...1..  107315648us : rcu_utilization: End context switch
irq/47-e-18009   0d...2..  107315648us : sched_switch: prev_comm=irq/47-eth2-rx- prev_pid=18009 prev_prio=49 prev_state=S ==> next_comm=irq/48-eth2-tx- next_pid=18010 next_prio=49
irq/48-e-18010   0d....11  107315649us : softirq_raise: vec=3 [action=NET_RX]
irq/48-e-18010   0.....12  107315649us : softirq_entry: vec=3 [action=NET_RX]
irq/48-e-18010   0.....12  107315650us : napi_poll: napi poll on napi struct ffff88040ae5f450 for device eth2 work 0 budget 64
irq/48-e-18010   0.....12  107315650us : softirq_exit: vec=3 [action=NET_RX]
irq/48-e-18010   0d...1..  107315651us : rcu_utilization: Start context switch
irq/48-e-18010   0d...1..  107315651us : rcu_utilization: End context switch
irq/48-e-18010   0d...2..
107315651us : sched_switch: prev_comm=irq/48-eth2-tx- prev_pid=18010 prev_prio=49 prev_state=S ==> next_comm=kworker/0:1 next_pid=607 next_prio=120
kworker/-607     0.......  107315652us : igb: val: 701
kworker/-607     0.......  107315652us : igb: val: 106
kworker/-607     0.......  107315652us : igb: val: 107
kworker/-607     0.......  107315652us+: igb: val: 600
kworker/-607     0d..h...  107315689us : local_timer_entry: vector=239
kworker/-607     0d..h1..  107315689us : hrtimer_interrupt: cpu=0 offset=-34521 curr=kworker/0:1[120] thread=cyclictest[19]
kworker/-607     0d..h1..  107315689us : hrtimer_cancel: hrtimer=ffff8803d42efe18
kworker/-607     0d..h...  107315689us : hrtimer_expire_entry: hrtimer=ffff8803d42efe18 function=hrtimer_wakeup now=752735681960
kworker/-607     0d..h1..  107315689us : sched_waking: comm=cyclictest pid=18015 prio=19 target_cpu=000
kworker/-607     0dN.h2..  107315690us : sched_wakeup: comm=cyclictest pid=18015 prio=19 target_cpu=000
kworker/-607     0dN.h...  107315690us : hrtimer_expire_exit: hrtimer=ffff8803d42efe18
kworker/-607     0dN.h1..  107315690us : hrtimer_interrupt: cpu=0 offset=318040 curr=kworker/0:1[120] thread=<none>[-1]
kworker/-607     0dN.h...  107315690us : write_msr: 6e0, value 28096cdb9ce
kworker/-607     0dN.h...  107315690us : local_timer_exit: vector=239
kworker/-607     0dN..1..  107315690us : rcu_utilization: Start context switch
kworker/-607     0dN..1..  107315691us : rcu_utilization: End context switch
kworker/-607     0dN..2..  107315691us : sched_stat_runtime: comm=kworker/0:1 pid=607 runtime=38439 [ns] vruntime=49754716513 [ns]
kworker/-607     0d...2..  107315691us : sched_switch: prev_comm=kworker/0:1 prev_pid=607 prev_prio=120 prev_state=R+ ==> next_comm=cyclictest next_pid=18015 next_prio=19
kworker/-607     0d...2..  107315691us : x86_fpu_regs_activated: x86/fpu: ffff8803f7f55940 fpregs_active: 1 fpstate_active: 1 counter: 99 xfeatures: 2 xcomp_bv: 0
kworker/-607     0d...2..
107315691us : write_msr: c0000100, value 7ffff7400700
cyclicte-18015   0.......  107315692us : sys_exit: NR 230 = 0
cyclicte-18015   0.......  107315697us : sys_enter: NR 1 (5, 7ffff7400300, 1f, 7ffff77a5460, 2, 7ffff744c99a)
cyclicte-18015   0.....11  107315698us : tracing_mark_write: hit latency threshold (37 > 33)
cyclicte-18015   0.......  107315699us : sys_exit: NR 1 = 31
cyclicte-18015   0.......  107315699us : sys_enter: NR 1 (4, 4076b0, 2, 7ffff77a5460, 2, 7ffff744c99a)

Also very interesting is the trace that I get on an idle system, without cyclictest
running. When I just enable my igb tracepoints, I get the following result:

[...]
kworker/-607     0.......  585779012us+: igb: val: 700
kworker/-607     0.......  585779042us : igb: val: 701
kworker/-607     0.......  585779042us : igb: val: 106
kworker/-607     0.......  585779043us : igb: val: 107
kworker/-607     0.......  585779043us+: igb: val: 600
kworker/-607     0.......  585779080us : igb: val: 601

The time between my trace points 700 and 701 is about 30us; the time between my
trace points 600 and 601 is even 37us!
The code in between is

    tsyncrxctl = rd32(E1000_TSYNCRXCTL);

and

    lvmmc = rd32(E1000_LVMMC);

respectively. In both cases this is a single read from a register.
I have no idea why this single read can take so much time!
Is it possible that the igb hardware is in a state that delays the read access, and that
this is why the whole I/O system gets delayed?

To have a proper comparison, I did the same with kernel 3.18.27-rt27: there, too, I
instrumented the igb driver to get traces for the rd32 calls. However, there everything
is generally much faster! On the idle system the maximum I got for a read was about
6us; most times it was 1-2us. On the 4.8 kernel the reads are consistently much slower
(see above).

My question is now: Is there any kernel config option that has been introduced in the
meantime that may lead to this effect and which is not set in my 4.8 config?
Regards
Mathias