Hi there, We have been testing an XDP program with Intel E810 / ICE driver. The program uses AF_XDP in Zero Copy mode. In some conditions, the program logic decides to return XDP_TX action rather than XDP_REDIRECT, so packets are transmitted back out the interface they arrived on. (In this condition we also swap the MAC addresses in Ethernet header within the BPF program). We recently identified and submitted a bug report onto XDP-newbies, as we were seeing memory type errors. [1] The patch supplied by Maciej fixed our initial problem of an incorrect XDP memory type. However, upon further testing on Kernel 6.1-rc7 with Maciej’s patch applied we have noticed some further strange behaviour with larger packet rates and more channels. It seems that when we use multiple channels with dedicated F/C queues, some of the channels stall and do not transmit any of the packets back out of the interface. We have been running a slightly modified xdpsock based program as a test harness on the test_zero_copy_tx branch [2] as per the previous reports. (Again, we’ve tested with the version of the program that enables multiple F/C queues per XSK). This XDP program now has some percpu stats added into it, so we can have visibility on what actions we are returning from the XDP prog. We observe that while the BPF program does receive every packet and return XDP_TX properly, some of the channels experience a TX failure, leading to packet loss. This led us to do a little bit of digging and we noticed that at a certain point, when a channel stopped transmitting packets back out the NIC, the tx_busy counter is being incremented. We looked into the ICE driver code and we believe this is originating from ice_xmit_xdp_ring() at ice_txrx_lib.c:288. We progressed to enable tracing on XDP events; we can see an xdp_exception trace for every failed transmit operation. We added two additional tracepoints in the XDP_TX code path within the ice_run_xdp_zc() function of ice_xsk.c file – one on successful TX, one on failed TX – with both highlighting xdp_ring->q_index and rx_ring->q_index values. Our rationale was to try and identify if there was a pattern to which channels were failing. We were able to establish with reasonable degree of certainty: A) Tracepoints following XDP_TX failure always had a TX/XDP ring q_index of 15 or less. B) Tracepoints following XDP_TX success always had a TX/XDP ring q_index of 16 or above. C) Both points above are true irrespective of the number of channels we had configured (1, 2, 4, 8) D) Tx/Rx ring sizes did not seem to alter behaviour – we tested with tx/rx rings of 1024, 2048, 4096. We think this might be a bug within the ICE driver, but we’ve only picked it up following resolution of the previous issue [1]. Our test rig happens to be a 16 core (dual socket) so we think that might have some bearing on the above. Due to the test machine being NUMA we have avoided testing with channel quantities above 8, but we’d be happy to if it would add value. We were testing this on the kernel version 6.1.0-rc7+ with the patch supplied previously by Maciej. Everything above was seen using Intel E810’s with the parameters [3] below. Any guidance would be appreciated. Thanks Robin [1] https://marc.info/?l=xdp-newbies&m=166973074709850&w=2 [2] https://github.com/OpenSource-THG/xdpsock-sample/tree/test_zero_copy_tx [3] Ethtool Parameters: # ethtool -i ice0 driver: ice version: 6.1.0-rc7+ firmware-version: 2.50 0x800077a8 1.2960.0 expansion-rom-version: bus-info: 0000:03:00.0 supports-statistics: yes supports-test: yes supports-eeprom-access: yes supports-register-dump: yes supports-priv-flags: yes # lspci -s 03:00.0 03:00.0 Ethernet controller: Intel Corporation Ethernet Controller E810-XXV for SFP (rev 02) # ethtool -g ice0 Ring parameters for ice0: Pre-set maximums: RX: 8160 RX Mini: n/a RX Jumbo: n/a TX: 8160 Current hardware settings: RX: 4096 RX Mini: n/a RX Jumbo: n/a TX: 4096 # ethtool -l ice0 Channel parameters for ice0: Pre-set maximums: RX: 16 TX: 16 Other: 1 Combined: 16 Current hardware settings: RX: 0 TX: 0 Other: 1 Combined: 4 Robin Cowley Software Engineer The Hut Group<http://www.thehutgroup.com/> Tel: Email: Robin.Cowley@xxxxxxxxxxxxxxx<mailto:Robin.Cowley@xxxxxxxxxxxxxxx> For the purposes of this email, the "company" means The Hut Group Limited, a company registered in England and Wales (company number 6539496) whose registered office is at Fifth Floor, Voyager House, Chicago Avenue, Manchester Airport, M90 3DQ and/or any of its respective subsidiaries. Confidentiality Notice This e-mail is confidential and intended for the use of the named recipient only. If you are not the intended recipient please notify us by telephone immediately on +44(0)1606 811888 or return it to us by e-mail. Please then delete it from your system and note that any use, dissemination, forwarding, printing or copying is strictly prohibited. Any views or opinions are solely those of the author and do not necessarily represent those of the company. Encryptions and Viruses Please note that this e-mail and any attachments have not been encrypted. They may therefore be liable to be compromised. Please also note that it is your responsibility to scan this e-mail and any attachments for viruses. We do not, to the extent permitted by law, accept any liability (whether in contract, negligence or otherwise) for any virus infection and/or external compromise of security and/or confidentiality in relation to transmissions sent by e-mail. Monitoring Activity and use of the company's systems is monitored to secure its effective use and operation and for other lawful business purposes. Communications using these systems will also be monitored and may be recorded to secure effective use and operation and for other lawful business purposes. hgvyjuv