> > For XDP_REDIRECT, the performance is shown as follows.
> >
> > root@imx8mpevk:~# ./xdp_redirect eth1 eth0
> > Redirecting from eth1 (ifindex 3; driver st_gmac) to eth0 (ifindex 2; driver fec)
>
> This is not exactly the same as the XDP_TX setup, as here you chose to
> redirect between eth1 (driver st_gmac) and eth0 (driver fec).
>
> I would like to see eth0 to eth0 XDP_REDIRECT, so we can compare it to
> XDP_TX performance.
>
> Sorry for all the requests, but can you provide those numbers?

Oh, sorry, I thought what you wanted were the XDP_REDIRECT results for
different NICs. Below is the result of XDP_REDIRECT on the same NIC.

root@imx8mpevk:~# ./xdp_redirect eth0 eth0
Redirecting from eth0 (ifindex 2; driver fec) to eth0 (ifindex 2; driver fec)
Summary      232,302 rx/s        0 err,drop/s      232,344 xmit/s
Summary      234,579 rx/s        0 err,drop/s      234,577 xmit/s
Summary      235,548 rx/s        0 err,drop/s      235,549 xmit/s
Summary      234,704 rx/s        0 err,drop/s      234,703 xmit/s
Summary      235,504 rx/s        0 err,drop/s      235,504 xmit/s
Summary      235,223 rx/s        0 err,drop/s      235,224 xmit/s
Summary      234,509 rx/s        0 err,drop/s      234,507 xmit/s
Summary      235,481 rx/s        0 err,drop/s      235,482 xmit/s
Summary      234,684 rx/s        0 err,drop/s      234,683 xmit/s
Summary      235,520 rx/s        0 err,drop/s      235,520 xmit/s
Summary      235,461 rx/s        0 err,drop/s      235,461 xmit/s
Summary      234,627 rx/s        0 err,drop/s      234,627 xmit/s
Summary      235,611 rx/s        0 err,drop/s      235,611 xmit/s
  Packets received    : 3,053,753
  Average packets/s   : 234,904
  Packets transmitted : 3,053,792
  Average transmit/s  : 234,907

> > I'm puzzled that moving the MMIO write doesn't change performance.
>
> Can you please verify that the packet generator machine is sending more
> frames than the system can handle?
>
> (meaning: is the pktgen_sample03_burst_single_flow.sh script fast enough?)

Thanks very much! You reminded me: in the previous tests I always started
the pktgen script first and then ran the xdp2 program, so when I stopped
the script I saw that the transmit speed of the generator was always
greater than the XDP_TX speed.
But actually, the real-time transmit speed of the generator had degraded
to match the XDP_TX speed. So I turned off the rx function of the
generator, to avoid increasing its CPU load with the traffic returned by
xdp2, and tested the performance again. Below are the results.

Result 1: current method
root@imx8mpevk:~# ./xdp2 eth0
proto 17:     326539 pkt/s
proto 17:     326464 pkt/s
proto 17:     326528 pkt/s
proto 17:     326465 pkt/s
proto 17:     326550 pkt/s

Result 2: dma_sync_len method
root@imx8mpevk:~# ./xdp2 eth0
proto 17:     353918 pkt/s
proto 17:     352923 pkt/s
proto 17:     353900 pkt/s
proto 17:     352672 pkt/s
proto 17:     353912 pkt/s

Note: the transmit speed of the generator is about 935,397 pps.

Comparing result 1 with result 2, the "dma_sync_len" method actually
improves the performance of XDP_TX, so the conclusion from the previous
tests is *incorrect*. I'm so sorry for that. :(

In addition, I also tried the "dma_sync_len" + not using
xdp_convert_buff_to_frame() method, and the performance improved
further. Below is the result.

Result 3: dma_sync_len + not using xdp_convert_buff_to_frame()
root@imx8mpevk:~# ./xdp2 eth0
proto 17:     369261 pkt/s
proto 17:     369267 pkt/s
proto 17:     369206 pkt/s
proto 17:     369214 pkt/s
proto 17:     369126 pkt/s

Therefore, I intend to use the "dma_sync_len" + not using
xdp_convert_buff_to_frame() method in the V5 patch.

Thank you again, Jesper and Jakub. You really helped me a lot. :)