Re: Multi-core scalability problems

On Wed, 14 Oct 2020 08:56:43 +0200
Federico Parola <fede.parola@xxxxxxxxxx> wrote:

[...]
> >
> > Can you try to use this[2] tool:
> >   ethtool_stats.pl --dev enp101s0f0
> >
> > And notice if there are any strange counters.
> >
> >
> > [2]https://github.com/netoptimizer/network-testing/blob/master/bin/ethtool_stats.pl
> > My best guess is that you have Ethernet flow-control enabled.
> > Some ethtool counter might show if that is the case.
> >  
> Here are the results of the tool:
> 
> 
> 1 FLOW:
> 
> Show adapter(s) (enp101s0f0) statistics (ONLY that changed!)
> Ethtool(enp101s0f0) stat:     35458700 (     35,458,700) <= port.fdir_sb_match /sec
> Ethtool(enp101s0f0) stat:   2729223958 (  2,729,223,958) <= port.rx_bytes /sec
> Ethtool(enp101s0f0) stat:      7185397 (      7,185,397) <= port.rx_dropped /sec
> Ethtool(enp101s0f0) stat:     42644155 (     42,644,155) <= port.rx_size_64 /sec
> Ethtool(enp101s0f0) stat:     42644140 (     42,644,140) <= port.rx_unicast /sec
> Ethtool(enp101s0f0) stat:   1062159456 (  1,062,159,456) <= rx-0.bytes /sec
> Ethtool(enp101s0f0) stat:     17702658 (     17,702,658) <= rx-0.packets /sec
> Ethtool(enp101s0f0) stat:   1062155639 (  1,062,155,639) <= rx_bytes /sec
> Ethtool(enp101s0f0) stat:     17756128 (     17,756,128) <= rx_dropped /sec
> Ethtool(enp101s0f0) stat:     17702594 (     17,702,594) <= rx_packets /sec
> Ethtool(enp101s0f0) stat:     35458743 (     35,458,743) <= rx_unicast /sec
> 
> ---
> 
> 
> 4 FLOWS:
> 
> Show adapter(s) (enp101s0f0) statistics (ONLY that changed!)
> Ethtool(enp101s0f0) stat:      9351001 (      9,351,001) <= port.fdir_sb_match /sec
> Ethtool(enp101s0f0) stat:   2559136358 (  2,559,136,358) <= port.rx_bytes /sec
> Ethtool(enp101s0f0) stat:     30635346 (     30,635,346) <= port.rx_dropped /sec
> Ethtool(enp101s0f0) stat:     39986386 (     39,986,386) <= port.rx_size_64 /sec
> Ethtool(enp101s0f0) stat:     39986799 (     39,986,799) <= port.rx_unicast /sec
> Ethtool(enp101s0f0) stat:    140177834 (    140,177,834) <= rx-0.bytes /sec
> Ethtool(enp101s0f0) stat:      2336297 (      2,336,297) <= rx-0.packets /sec
> Ethtool(enp101s0f0) stat:    140260002 (    140,260,002) <= rx-1.bytes /sec
> Ethtool(enp101s0f0) stat:      2337667 (      2,337,667) <= rx-1.packets /sec
> Ethtool(enp101s0f0) stat:    140261431 (    140,261,431) <= rx-2.bytes /sec
> Ethtool(enp101s0f0) stat:      2337691 (      2,337,691) <= rx-2.packets /sec
> Ethtool(enp101s0f0) stat:    140175690 (    140,175,690) <= rx-3.bytes /sec
> Ethtool(enp101s0f0) stat:      2336262 (      2,336,262) <= rx-3.packets /sec
> Ethtool(enp101s0f0) stat:    560877338 (    560,877,338) <= rx_bytes /sec
> Ethtool(enp101s0f0) stat:         3354 (          3,354) <= rx_dropped /sec
> Ethtool(enp101s0f0) stat:      9347956 (      9,347,956) <= rx_packets /sec
> Ethtool(enp101s0f0) stat:      9351183 (      9,351,183) <= rx_unicast /sec
> 
> 
> So if I understand the field port.rx_dropped represents packets dropped 
> due to a lack of buffer on the NIC while rx_dropped represents packets 
> dropped because upper layers aren't able to process them, am I right?
> 
> It seems that the problem is in the NIC.

Yes, it seems that the problem is in the NIC hardware, or config of the
NIC hardware.

Look at the counter "port.fdir_sb_match":
- 1 flow:  35,458,700 = port.fdir_sb_match /sec
- 4 flows:  9,351,001 = port.fdir_sb_match /sec
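
To put numbers on this, here is a small sketch in Python that uses only
the per-second values pasted above. It assumes (as you suggested) that
port.rx_dropped is counted at the port, before packets reach the RX
queues; the helper name summarize() is just for illustration. It shows
that the packets surviving the port-level drop match port.fdir_sb_match
almost exactly in both runs, while the port-level drop share grows from
roughly 17% (1 flow) to roughly 77% (4 flows).

```python
# Per-second counters copied from the ethtool_stats.pl output quoted above.
stats = {
    "1 flow": {
        "port.rx_unicast": 42_644_140,
        "port.rx_dropped": 7_185_397,
        "port.fdir_sb_match": 35_458_700,
    },
    "4 flows": {
        "port.rx_unicast": 39_986_799,
        "port.rx_dropped": 30_635_346,
        "port.fdir_sb_match": 9_351_001,
    },
}


def summarize(c):
    """Derive port-level figures from one set of counters.

    Assumption: port.rx_dropped counts packets lost at the port,
    so (port.rx_unicast - port.rx_dropped) is what gets past it.
    """
    passed = c["port.rx_unicast"] - c["port.rx_dropped"]
    return {
        "passed_port": passed,
        "port_drop_pct": 100.0 * c["port.rx_dropped"] / c["port.rx_unicast"],
        "fdir_vs_passed": c["port.fdir_sb_match"] / passed,
    }


for name, counters in stats.items():
    s = summarize(counters)
    print(
        f"{name}: passed port = {s['passed_port']:,} pps, "
        f"dropped at port = {s['port_drop_pct']:.1f}%, "
        f"fdir_sb_match / passed = {s['fdir_vs_passed']:.5f}"
    )
```

The ratio in the last column staying close to 1 is what makes the Flow
Director path look like the place to dig.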

I think fdir_sb stands for Flow Director Sideband filter (in the driver
code this is sometimes related to "ATR", Application Targeted Routing).
Note: I've seen fdir_match before, but not the "sb" variant,
fdir_sb_match. This happens inside the NIC HW/FW, which filters on
flows and makes sure packets of the same flow go to the same RX-queue
number, to avoid out-of-order (OOO) packets. This looks like the
limiting factor in your setup.

Have you installed any filters yourself?

Try to disable Flow Director. The syntax is
"ethtool -K ethX ntuple <on|off>", so for your device:

 ethtool -K enp101s0f0 ntuple off
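
To verify the setting, "ethtool -k ethX" lists the feature as
"ntuple-filters", and "ethtool -n ethX" shows any installed ntuple
rules. Below is a minimal sketch in Python of checking the feature
state from a script. The function names and the sample text are
illustrative only (the sample is not output from your machine), and
read_features() assumes ethtool is installed and in PATH.

```python
import subprocess


def ntuple_state(features_text):
    """Return 'on' or 'off' for ntuple-filters, or None if not listed.

    Expects the text printed by `ethtool -k <dev>`, where each feature
    is a "name: value" line (the value may be followed by "[fixed]").
    """
    for line in features_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "ntuple-filters":
            fields = value.split()
            return fields[0] if fields else None
    return None


def read_features(dev):
    """Run `ethtool -k <dev>` and return its stdout (needs ethtool)."""
    result = subprocess.run(
        ["ethtool", "-k", dev],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


# Illustrative sample only, shaped like `ethtool -k` output.
sample = """Features for enp101s0f0:
rx-checksumming: on
ntuple-filters: off
receive-hashing: on
"""
print("ntuple-filters:", ntuple_state(sample))
```

On the real box you would call ntuple_state(read_features("enp101s0f0"))
before and after the change, then rerun the 4-flow test.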

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer




