Re: [RFC Optimizing veth xsk performance 00/10]

On 03/08/2023 16.04, huangjie.albert wrote:
AF_XDP is a kernel bypass technology that can greatly improve performance.
However, for virtual devices like veth, even with the use of AF_XDP sockets,
there are still many additional software paths that consume CPU resources.
This patch series focuses on optimizing the performance of AF_XDP sockets
for veth virtual devices. Patches 1 to 4 mainly involve preparatory work.
Patch 5 introduces a tx queue and tx NAPI for packet transmission, patch 9
implements zero-copy, and patch 10 adds support for batch sending of IPv4
UDP packets. These optimizations significantly shorten the software path
and add support for checksum offload.
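
For reference, once the series is applied, the new tx ring parameters and the
checksum offload state should be visible through standard ethtool queries.
This is only a sketch; the interface name (veth, taken from the topology
below) is assumed:

ethtool -g veth                      # ring parameters, exposed by patch 1 (get_ringparam)
ethtool -k veth | grep -i checksum   # checksum offload state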

I tested these features with the typical topology shown below:
veth<-->veth-peer                                    veth1-peer<--->veth1
	1       |                                                  |   7
	        |2                                                6|
	        |                                                  |
	      bridge<------->eth0(mlnx5)- switch -eth1(mlnx5)<--->bridge1
                   3                    4                 5
              (machine1)                              (machine2)
An AF_XDP socket is attached to veth and to veth1, and packets are sent out via the physical NIC (eth0).
veth:(172.17.0.2/24)
bridge:(172.17.0.1/24)
eth0:(192.168.156.66/24)

veth1:(172.17.0.2/24)
bridge1:(172.17.0.1/24)
eth1:(192.168.156.88/24)

After configuring the default route, SNAT, and DNAT (one possible setup is
sketched below), we can run tests to get the performance results.
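
A minimal sketch of that setup on machine1 (interface names, addresses, and
the UDP port are assumed from the topology above; machine2 mirrors it with
veth1/bridge1/eth1):

ip route add default via 172.17.0.1      # default route on the veth side (e.g. inside its container/netns)
iptables -t nat -A POSTROUTING -s 172.17.0.0/24 -o eth0 -j MASQUERADE               # SNAT towards the physical NIC
iptables -t nat -A PREROUTING -i eth0 -p udp --dport 6002 -j DNAT --to-destination 172.17.0.2   # DNAT back to the local veth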

Packets are sent from veth to veth1:
AF_XDP test tool:
link: https://github.com/cclinuxer/libxudp
send:(veth)
./objs/xudpperf send --dst 192.168.156.88:6002 -l 1300
recv:(veth1)
./objs/xudpperf recv --src 172.17.0.2:6002

UDP test tool: iperf3
send:(veth)
iperf3 -c 192.168.156.88 -p 6002 -l 1300 -b 60G -u
recv:(veth1)
iperf3 -s -p 6002

Performance (veth test with the libxdp lib):
UDP                              : 250 Kpps (with 100% cpu)
AF_XDP   no  zerocopy + no batch : 480 Kpps (with ksoftirqd 100% cpu)
AF_XDP  with zerocopy + no batch : 540 Kpps (with ksoftirqd 100% cpu)
AF_XDP  with  batch  +  zerocopy : 1.5 Mpps (with ksoftirqd 15% cpu)
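
The ksoftirqd CPU shares quoted above can be checked with standard tools; one
possible way, assuming the sysstat package is available, is:

pidstat -u -p "$(pgrep -d, ksoftirqd)" 1    # per-thread CPU usage of the ksoftirqd tasks, reported once per second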

With AF_XDP batching, the libxdp user-space program becomes the bottleneck.

Do you mean libxdp [1] or libxudp?

[1] https://github.com/xdp-project/xdp-tools/tree/master/lib/libxdp

Therefore, the softirq did not reach the limit.

This is just an RFC patch series, and some code details still need
further consideration. Please review this proposal.


I find this performance work interesting, as we have customer requests
(via Maryam (cc)) to improve AF_XDP performance, both native and on veth.

Our benchmark is stored at:
 https://github.com/maryamtahhan/veth-benchmark

Great to see other companies also interested in this area.

--Jesper

thanks!

huangjie.albert (10):
   veth: Implement ethtool's get_ringparam() callback
   xsk: add dma_check_skip for skipping dma check
   veth: add support for send queue
   xsk: add xsk_tx_completed_addr function
   veth: use send queue tx napi to xmit xsk tx desc
   veth: add ndo_xsk_wakeup callback for veth
   sk_buff: add destructor_arg_xsk_pool for zero copy
   xdp: add xdp_mem_type MEM_TYPE_XSK_BUFF_POOL_TX
   veth: support zero copy for af xdp
   veth: af_xdp tx batch support for ipv4 udp

  drivers/net/veth.c          | 729 +++++++++++++++++++++++++++++++++++-
  include/linux/skbuff.h      |   1 +
  include/net/xdp.h           |   1 +
  include/net/xdp_sock_drv.h  |   1 +
  include/net/xsk_buff_pool.h |   1 +
  net/xdp/xsk.c               |   6 +
  net/xdp/xsk_buff_pool.c     |   3 +-
  net/xdp/xsk_queue.h         |  11 +
  8 files changed, 751 insertions(+), 2 deletions(-)
