Re: XDP redirect max rate on Intel XL710



On Mon, 12 Apr 2021 15:48:10 +0200
Federico Parola <federico.parola@xxxxxxxxx> wrote:

> Hello everybody,
> when I redirect packets between two ports of an Intel XL710 40Gb card 
> (e.g. with the xdp_redirect_map example fo the kernel) I can not achieve 
> throughputs higher than ~31 Mpps.
> This NIC is not able to reach the theoretical ~60 Mpps with small

40*10^9 / (84*8) = 59,523,810 pps

> packets ([1] p. 23) but with DPDK I'm able to achieve ~40 Mpps with the 
> testpmd application and 2 cores.

Lets calculate the processing or arrival rate per packet (or between
packets arriving).

 31 Mpps = 32.26 nanosec  (10^9 / (31*10^6))
 40 Mpps = 25    nanosec
 Difference = 7.26 nanosec.

Thus, you only have to optimize XDP redirect by ~7 nanosec to catch up
with DPDK.  Assuming a 4GHz CPU (7*4), that is 28 extra cycles with
DPDK, which I'm sure your application really needs and uses carefully ;-)

> In XDP when dropping packets I achieve more or less the same throughput 
> with 3 cores, but I'm not able to exceed 31 Mpps when forwarding, no 
> matter how many cores I use.
> I tried tuning the size of the RX/TX rings and the DDIO occupancy but 
> with no success. I can scale with the number of cores more linearly but 
> as soon as I reach the 31 threshold cores usage decreases and throughput 
> remains the same.

If the CPU usage starts to decrease (e.g. the CPU gets idle cycles) then
the limitation is likely elsewhere, probably in the PCIe layer.  It
could also be a limit in the NIC hardware, but given DPDK is 7 nanosec
faster it is probably not the NIC hardware itself; on the other hand,
DPDK could have disabled some flow-dissector HW feature.  It can also be
that XDP/kernel is doing more PCIe operations than DPDK.

> I don't know if this is just related to my setup or my specific NIC (I 
> exprimented on kernels 5.11 and 5.9), does anybody know what the reason 
> could be?

Bjørn has recently optimized the kernel XDP redirect code path by
around 3-4 nanosec.  You can try with a net-next kernel (Bjørn, correct
me if this isn't true).

In your userspace AF_XDP application also make sure you don't use these
28 cycles / 7 nanosec for something that the DPDK testpmd doesn't.

> [1] 

Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
