Re: Network latency on 5.4 and 5.10

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



In the trace below (5.10), the NET_RX softirq is executed very late by ksoftirqd
instead of directly by the irq thread.

Can I somehow force the irq thread to always handle the softirq? I fear that
raising the priority of ksoftirqd will cause all kinds of undesired latencies
because its a rather general thread.

While tracing, I also see some PI here and there, that introduce various amounts
of latencies. Is there a good way to find the mutex responsible for the PI?


    RTController-20346   [001] .....12  3001.164215: netif_rx: dev=lo skbaddr=00000000f2ce0335 len=60
    RTController-20346   [001] d...113  3001.164215: softirq_raise: vec=3 [action=NET_RX]
    RTController-20346   [001] .....12  3001.164215: netif_rx_exit: ret=0
    RTController-20346   [001] .....12  3001.164216: net_dev_xmit: dev=lo skbaddr=00000000f2ce0335 len=74 rc=0
    RTController-20346   [001] ...s.12  3001.164216: softirq_entry: vec=3 [action=NET_RX]
    RTController-20346   [001] ...s.12  3001.164216: netif_receive_skb: dev=lo skbaddr=00000000f2ce0335 len=60
<IRQ from returning EtherCAT packet>
    RTController-20346   [001] d..H212  3001.164217: irq_handler_entry: irq=131 name=ecat0-rx-0
    RTController-20346   [001] d..H212  3001.164217: irq_handler_exit: irq=131 ret=handled
    RTController-20346   [001] d..H412  3001.164217: sched_waking: comm=irq/131-ecat0-r pid=19970 prio=19 target_cpu=001
    RTController-20346   [001] dN.H512  3001.164218: sched_wakeup: comm=irq/131-ecat0-r pid=19970 prio=19 target_cpu=001
    RTController-20346   [001] dN.s112  3001.164218: rcu_utilization: Start context switch
    RTController-20346   [001] dN.s112  3001.164219: rcu_utilization: End context switch
    RTController-20346   [001] dN.s212  3001.164219: sched_stat_runtime: comm=RTController pid=20346 runtime=37773 [ns] vruntime=50306493110 [ns]
    RTController-20346   [001] d..s212  3001.164221: sched_switch: prev_comm=RTController prev_pid=20346 prev_prio=120 prev_state=R+ ==> next_comm=irq/131-ecat0-r next_pid=19970 next_prio=19
    RTController-20346   [001] d..s212  3001.164221: x86_fpu_regs_deactivated: x86/fpu: 00000000a8ca490d load: 0 xfeatures: 3 xcomp_bv: 8000000000000003
<IRQ thread starts working>
 irq/131-ecat0-r-19970   [001] d...2..  3001.164222: write_msr: 48, value 0
 irq/131-ecat0-r-19970   [001] d..h1..  3001.164222: local_timer_entry: vector=236
 irq/131-ecat0-r-19970   [001] d..h2..  3001.164222: hrtimer_cancel: hrtimer=00000000cbde1eb6
 irq/131-ecat0-r-19970   [001] d..h1..  3001.164222: hrtimer_expire_entry: hrtimer=00000000cbde1eb6 function=hrtimer_wakeup now=3001159523681
 irq/131-ecat0-r-19970   [001] d..h3..  3001.164223: sched_waking: comm=SafetyLogicShad pid=20142 prio=30 target_cpu=001
 irq/131-ecat0-r-19970   [001] d..h4..  3001.164223: sched_wakeup: comm=SafetyLogicShad pid=20142 prio=30 target_cpu=001
 irq/131-ecat0-r-19970   [001] d..h1..  3001.164223: hrtimer_expire_exit: hrtimer=00000000cbde1eb6
          <idle>-0       [000] ....1..  3001.164223: cpu_idle: state=4294967295 cpu_id=0
          <idle>-0       [000] d...1..  3001.164224: cpu_idle: state=0 cpu_id=0
 irq/131-ecat0-r-19970   [001] d..h1..  3001.164224: write_msr: 6e0, value 91a6aa5a118
 irq/131-ecat0-r-19970   [001] d..h1..  3001.164224: local_timer_exit: vector=236
<PI? WTF?>
 irq/131-ecat0-r-19970   [001] d...311  3001.164225: sched_pi_setprio: comm=RTController pid=20346 oldprio=120 newprio=19
 irq/131-ecat0-r-19970   [001] d...111  3001.164226: rcu_utilization: Start context switch
 irq/131-ecat0-r-19970   [001] d...111  3001.164226: rcu_utilization: End context switch
 irq/131-ecat0-r-19970   [001] d...211  3001.164227: sched_switch: prev_comm=irq/131-ecat0-r prev_pid=19970 prev_prio=19 prev_state=R+ ==> next_comm=RTController next_pid=20346 next_prio=19
    RTController-20346   [001] d..s212  3001.164227: write_msr: 48, value 4
    RTController-20346   [001] ...s.12  3001.164230: fib_table_lookup: table 254 oif 0 iif 1 proto 6 127.0.0.1/9000 -> 127.0.0.1/57200 tos 0 scope 0 flags 0 ==> dev lo gw 114.121.0.0/1901:102:4b4f::ffff:4574:6865 err 0
    RTController-20346   [001] ...s.12  3001.164231: kmem_cache_alloc_node: call_site=__alloc_skb+0x4f/0x200 ptr=000000000d68a52d bytes_req=224 bytes_alloc=256 gfp_flags=GFP_ATOMIC node=-1
    RTController-20346   [001] ...s.12  3001.164231: kmalloc_node: call_site=__ip_append_data+0xc08/0xdf0 ptr=000000007e7f4744 bytes_req=448 bytes_alloc=512 gfp_flags=GFP_ATOMIC|__GFP_NOWARN|__GFP_NOMEMALLOC node=-1
    RTController-20346   [001] ...s.12  3001.164232: kfree: call_site=__ip_make_skb+0x2f2/0x420 ptr=0000000000000000
          <idle>-0       [000] ....1..  3001.164233: cpu_idle: state=4294967295 cpu_id=0
          <idle>-0       [000] d...1..  3001.164233: cpu_idle: state=0 cpu_id=0
    RTController-20346   [001] ...s.12  3001.164234: net_dev_queue: dev=lo skbaddr=000000000d68a52d len=54
    RTController-20346   [001] ...s.12  3001.164234: net_dev_start_xmit: dev=lo queue_mapping=0 skbaddr=000000000d68a52d vlan_tagged=0 vlan_proto=0x0000 vlan_tci=0x0000 protocol=0x0800 ip_summed=0 len=54 data_len=0 network_offset=14 transport_offset_valid=1 transport_offset=34 tx_flags=0 gso_size=0 gso_segs=0 gso_type=0x0
    RTController-20346   [001] ...s.12  3001.164234: netif_rx_entry: dev=lo napi_id=0x0 queue_mapping=0 skbaddr=000000000d68a52d vlan_tagged=0 vlan_proto=0x0000 vlan_tci=0x0000 protocol=0x0800 ip_summed=0 hash=0x00000000 l4_hash=0 len=40 data_len=0 truesize=768 mac_header_valid=1 mac_header=-14 nr_frags=0 gso_size=0 gso_type=0x0
    RTController-20346   [001] ...s.12  3001.164234: netif_rx: dev=lo skbaddr=000000000d68a52d len=40
    RTController-20346   [001] ...s.12  3001.164234: netif_rx_exit: ret=0
    RTController-20346   [001] ...s.12  3001.164234: net_dev_xmit: dev=lo skbaddr=000000000d68a52d len=54 rc=0
    RTController-20346   [001] ...s.12  3001.164235: kfree_skb: skbaddr=00000000f2ce0335 protocol=2048 location=000000005f4796b3
    RTController-20346   [001] ...s.12  3001.164235: netif_receive_skb: dev=lo skbaddr=000000000d68a52d len=40
    RTController-20346   [001] ...s.12  3001.164237: kfree: call_site=nf_conntrack_free+0x1f/0x40 [nf_conntrack] ptr=00000000d1a8317d
    RTController-20346   [001] ...s.12  3001.164237: kmem_cache_free: call_site=nf_conntrack_free+0x2e/0x40 [nf_conntrack] ptr=000000006bfc0f40
    RTController-20346   [001] ...s.12  3001.164238: napi_poll: napi poll on napi struct 0000000081e7f255 for device (no_device) work 2 budget 64
    RTController-20346   [001] ...s.12  3001.164238: softirq_exit: vec=3 [action=NET_RX]
<Uff, only a short one. But my packet that just arrived on ecat0 has not been handled...>
    RTController-20346   [001] d...311  3001.164238: sched_pi_setprio: comm=RTController pid=20346 oldprio=19 newprio=120
    RTController-20346   [001] dN..311  3001.164240: sched_waking: comm=irq/131-ecat0-r pid=19970 prio=19 target_cpu=001
    RTController-20346   [001] dN..411  3001.164240: sched_wakeup: comm=irq/131-ecat0-r pid=19970 prio=19 target_cpu=001
    RTController-20346   [001] dN..311  3001.164241: sched_stat_runtime: comm=RTController pid=20346 runtime=2053 [ns] vruntime=50306496047 [ns]
    RTController-20346   [001] dN.h...  3001.164242: local_timer_entry: vector=236
    RTController-20346   [001] dN.h1..  3001.164242: softirq_raise: vec=8 [action=HRTIMER]
          <idle>-0       [000] ....1..  3001.164242: cpu_idle: state=4294967295 cpu_id=0
          <idle>-0       [000] d...1..  3001.164242: cpu_idle: state=0 cpu_id=0
    RTController-20346   [001] dN.h...  3001.164242: write_msr: 6e0, value 91a6aace636
    RTController-20346   [001] dN.h...  3001.164242: local_timer_exit: vector=236
    RTController-20346   [001] dN..2..  3001.164242: sched_waking: comm=ksoftirqd/1 pid=21 prio=120 target_cpu=001
    RTController-20346   [001] dN..3..  3001.164243: sched_wakeup: comm=ksoftirqd/1 pid=21 prio=120 target_cpu=001
    RTController-20346   [001] dN..1..  3001.164243: rcu_utilization: Start context switch
    RTController-20346   [001] dN..1..  3001.164243: rcu_utilization: End context switch
    RTController-20346   [001] dN..2..  3001.164243: sched_stat_runtime: comm=RTController pid=20346 runtime=2364 [ns] vruntime=50306498411 [ns]
    RTController-20346   [001] d...2..  3001.164244: sched_switch: prev_comm=RTController prev_pid=20346 prev_prio=120 prev_state=R+ ==> next_comm=irq/131-ecat0-r next_pid=19970 next_prio=19
< because it has not yet signaled the softirq? >
 irq/131-ecat0-r-19970   [001] d...211  3001.164245: write_msr: 48, value 0
 irq/131-ecat0-r-19970   [001] d....12  3001.164245: softirq_raise: vec=3 [action=NET_RX]
 irq/131-ecat0-r-19970   [001] d...1..  3001.164246: rcu_utilization: Start context switch
 irq/131-ecat0-r-19970   [001] d...1..  3001.164246: rcu_utilization: End context switch
 irq/131-ecat0-r-19970   [001] d...2..  3001.164247: sched_switch: prev_comm=irq/131-ecat0-r prev_pid=19970 prev_prio=19 prev_state=S ==> next_comm=SafetyLogicShad next_pid=20142 next_prio=30
 irq/131-ecat0-r-19970   [001] d...2..  3001.164249: tlb_flush: pages:0 reason:flush on task switch (0)
< hey, why is my packet not handled by the softirq right away? >
 SafetyLogicShad-20142   [001] d...2..  3001.164250: write_msr: 48, value 4
 SafetyLogicShad-20142   [001] ....1..  3001.164250: sys_nanosleep -> 0x0
 SafetyLogicShad-20142   [001] .......  3001.164250: sys_exit: NR 35 = 0
 SafetyLogicShad-20142   [001] d......  3001.164251: x86_fpu_regs_activated: x86/fpu: 00000000e034cb86 load: 1 xfeatures: 3 xcomp_bv: 8000000000000003
          <idle>-0       [000] ....1..  3001.164251: cpu_idle: state=4294967295 cpu_id=0
          <idle>-0       [000] d...1..  3001.164252: cpu_idle: state=0 cpu_id=0
          <idle>-0       [000] ....1..  3001.164261: cpu_idle: state=4294967295 cpu_id=0
          <idle>-0       [000] d...1..  3001.164261: cpu_idle: state=0 cpu_id=0
          <idle>-0       [000] ....1..  3001.164270: cpu_idle: state=4294967295 cpu_id=0
          <idle>-0       [000] d...1..  3001.164270: cpu_idle: state=0 cpu_id=0
          <idle>-0       [000] ....1..  3001.164279: cpu_idle: state=4294967295 cpu_id=0
          <idle>-0       [000] d...1..  3001.164279: cpu_idle: state=0 cpu_id=0
          <idle>-0       [000] ....1..  3001.164289: cpu_idle: state=4294967295 cpu_id=0
          <idle>-0       [000] d...1..  3001.164289: cpu_idle: state=0 cpu_id=0
          <idle>-0       [000] ....1..  3001.164298: cpu_idle: state=4294967295 cpu_id=0
          <idle>-0       [000] d...1..  3001.164298: cpu_idle: state=0 cpu_id=0
          <idle>-0       [000] ....1..  3001.164307: cpu_idle: state=4294967295 cpu_id=0
          <idle>-0       [000] d...1..  3001.164308: cpu_idle: state=0 cpu_id=0
          <idle>-0       [000] ....1..  3001.164317: cpu_idle: state=4294967295 cpu_id=0
          <idle>-0       [000] d...1..  3001.164317: cpu_idle: state=0 cpu_id=0
          <idle>-0       [000] d..h1..  3001.164324: local_timer_entry: vector=236
          <idle>-0       [000] d..h2..  3001.164324: hrtimer_cancel: hrtimer=00000000eb8a586a
          <idle>-0       [000] d..h1..  3001.164324: hrtimer_expire_entry: hrtimer=00000000eb8a586a function=hrtimer_wakeup now=3001159625341
          <idle>-0       [000] d..h3..  3001.164324: sched_waking: comm=EtherlabDaemon pid=20787 prio=20 target_cpu=000
          <idle>-0       [000] dN.h4..  3001.164325: sched_wakeup: comm=EtherlabDaemon pid=20787 prio=20 target_cpu=000
          <idle>-0       [000] dN.h1..  3001.164325: hrtimer_expire_exit: hrtimer=00000000eb8a586a
          <idle>-0       [000] dN.h1..  3001.164325: write_msr: 6e0, value 91a6aac112e
          <idle>-0       [000] dN.h1..  3001.164325: local_timer_exit: vector=236
          <idle>-0       [000] .N..1..  3001.164326: cpu_idle: state=4294967295 cpu_id=0
          <idle>-0       [000] dN..1..  3001.164326: rcu_utilization: Start context switch
          <idle>-0       [000] dN..1..  3001.164326: rcu_utilization: End context switch
          <idle>-0       [000] d...2..  3001.164326: sched_switch: prev_comm=swapper/0 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=EtherlabDaemon next_pid=20787 next_prio=20
  EtherlabDaemon-20787   [000] d...2..  3001.164327: write_msr: 48, value 4
  EtherlabDaemon-20787   [000] ....1..  3001.164327: sys_nanosleep -> 0x0
  EtherlabDaemon-20787   [000] .......  3001.164327: sys_exit: NR 35 = 0
  EtherlabDaemon-20787   [000] d......  3001.164327: x86_fpu_regs_activated: x86/fpu: 000000007cc67df0 load: 1 xfeatures: 3 xcomp_bv: 8000000000000003
< well, this sucks, no packet is ready for me... >
  EtherlabDaemon-20787   [000] ....1..  3001.164330: sys_recvfrom(fd: a, ubuf: a90c10, size: 640, flags: 40, addr: 0, addr_len: 0)
  EtherlabDaemon-20787   [000] .......  3001.164330: sys_enter: NR 45 (a, a90c10, 640, 40, 0, 0)
  EtherlabDaemon-20787   [000] ....1..  3001.164331: sys_recvfrom -> 0xfffffffffffffff5
  EtherlabDaemon-20787   [000] .......  3001.164331: sys_exit: NR 45 = -11
 SafetyLogicShad-20142   [001] ....1..  3001.164337: sys_nanosleep(rqtp: 7ffef5e5f960, rmtp: 0)
 SafetyLogicShad-20142   [001] .......  3001.164337: sys_enter: NR 35 (7ffef5e5f960, 0, 1, 32f4, 3e8, 4)
  EtherlabDaemon-20787   [000] ....1..  3001.164337: sys_sendto(fd: a, buff: a8cee0, len: 36a, flags: 0, addr: 0, addr_len: 0)
 SafetyLogicShad-20142   [001] .......  3001.164337: hrtimer_init: hrtimer=00000000cbde1eb6 clockid=CLOCK_MONOTONIC mode=REL
  EtherlabDaemon-20787   [000] .......  3001.164337: sys_enter: NR 44 (a, a8cee0, 36a, 0, 0, 0)
 SafetyLogicShad-20142   [001] d...1..  3001.164338: hrtimer_start: hrtimer=00000000cbde1eb6 function=hrtimer_wakeup expires=3001160639092 softexpires=3001160639092 mode=0x9
 SafetyLogicShad-20142   [001] d...1..  3001.164338: rcu_utilization: Start context switch
  EtherlabDaemon-20787   [000] .......  3001.164338: kmem_cache_alloc_node: call_site=__alloc_skb+0x4f/0x200 ptr=000000005b815b30 bytes_req=224 bytes_alloc=256 gfp_flags=GFP_KERNEL node=-1
 SafetyLogicShad-20142   [001] d...1..  3001.164338: rcu_utilization: End context switch
  EtherlabDaemon-20787   [000] .......  3001.164338: kmalloc_node: call_site=alloc_skb_with_frags+0x4d/0x1b0 ptr=0000000091eb5053 bytes_req=1216 bytes_alloc=2048 gfp_flags=GFP_KERNEL|__GFP_NOWARN|__GFP_NOMEMALLOC node=-1
  EtherlabDaemon-20787   [000] .....12  3001.164339: net_dev_queue: dev=ecat0 skbaddr=000000005b815b30 len=874
  EtherlabDaemon-20787   [000] .....15  3001.164339: net_dev_start_xmit: dev=ecat0 queue_mapping=0 skbaddr=000000005b815b30 vlan_tagged=0 vlan_proto=0x0000 vlan_tci=0x0000 protocol=0x88a4 ip_summed=0 len=874 data_len=0 network_offset=14 transport_offset_valid=0 transport_offset=65533 tx_flags=0 gso_size=0 gso_segs=0 gso_type=0x0
 SafetyLogicShad-20142   [001] d...2..  3001.164339: sched_switch: prev_comm=SafetyLogicShad prev_pid=20142 prev_prio=30 prev_state=S ==> next_comm=ksoftirqd/1 next_pid=21 next_prio=120
  EtherlabDaemon-20787   [000] .....15  3001.164340: net_dev_xmit: dev=ecat0 skbaddr=000000005b815b30 len=874 rc=0
 SafetyLogicShad-20142   [001] d...2..  3001.164340: x86_fpu_regs_deactivated: x86/fpu: 00000000e034cb86 load: 0 xfeatures: 3 xcomp_bv: 8000000000000003
  EtherlabDaemon-20787   [000] .....14  3001.164340: qdisc_dequeue: dequeue ifindex=2 qdisc handle=0x0 parent=0x1 txq_state=0x0 packets=0 skbaddr=0000000000000000
  EtherlabDaemon-20787   [000] ....1..  3001.164340: sys_sendto -> 0x36a
  EtherlabDaemon-20787   [000] .......  3001.164340: sys_exit: NR 44 = 874
< only now the packet is being handled >
     ksoftirqd/1-21      [001] d...2..  3001.164340: write_msr: 48, value 0
     ksoftirqd/1-21      [001] ...s.12  3001.164341: softirq_entry: vec=3 [action=NET_RX]
     ksoftirqd/1-21      [001] ...s.12  3001.164342: kmem_cache_alloc: call_site=__build_skb+0x1f/0xe0 ptr=000000006c4a30be bytes_req=224 bytes_alloc=256 gfp_flags=GFP_ATOMIC
     ksoftirqd/1-21      [001] ...s.12  3001.164342: napi_gro_receive_entry: dev=ecat0 napi_id=0x0 queue_mapping=1 skbaddr=000000006c4a30be vlan_tagged=0 vlan_proto=0x0000 vlan_tci=0x0000 protocol=0x88a4 ip_summed=0 hash=0x00000000 l4_hash=0 len=874 data_len=0 truesize=2304 mac_header_valid=1 mac_header=-14 nr_frags=0 gso_size=0 gso_type=0x0
     ksoftirqd/1-21      [001] ...s.12  3001.164342: napi_gro_receive_exit: ret=3
     ksoftirqd/1-21      [001] ...s.12  3001.164342: netif_receive_skb: dev=ecat0 skbaddr=000000006c4a30be len=874
     ksoftirqd/1-21      [001] ...s.12  3001.164343: napi_poll: napi poll on napi struct 0000000028309abc for device ecat0 work 1 budget 64
  EtherlabDaemon-20787   [000] ....1..  3001.164343: sys_futex(uaddr: 7f8d9fca7000, op: 1, val: 1, utime: 0, uaddr2: 5b9663, val3: bb9)
     ksoftirqd/1-21      [001] ...s.12  3001.164343: softirq_exit: vec=3 [action=NET_RX]
  EtherlabDaemon-20787   [000] .......  3001.164343: sys_enter: NR 202 (7f8d9fca7000, 1, 1, 0, 5b9663, bb9)
     ksoftirqd/1-21      [001] ...s.12  3001.164343: softirq_entry: vec=8 [action=HRTIMER]



[Index of Archives]     [RT Stable]     [Kernel Newbies]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Samba]     [Video 4 Linux]     [Device Mapper]

  Powered by Linux