On Wed, Jun 12, 2024 at 5:50 AM Maciej Fijalkowski
<maciej.fijalkowski@xxxxxxxxx> wrote:
>
> On Wed, Jun 12, 2024 at 01:47:06PM +0200, Magnus Karlsson wrote:
> > On Tue, 11 Jun 2024 at 22:43, YiFei Zhu <zhuyifei@xxxxxxxxxx> wrote:
> > >
> > > We have observed that hardware NIC drivers may have faulty AF_XDP
> > > implementations, and there seems to be a lack of a test of the
> > > various modes in which AF_XDP can run. This series adds a test to
> > > verify that NIC drivers implement many AF_XDP features by
> > > performing a send / receive of a single UDP packet.
> > >
> > > I put the C code of the test under selftests/bpf because I'm not
> > > really sure how I'd build the BPF-related code without the
> > > selftests/bpf build infrastructure.
> >
> > Happy to see that you are contributing a number of new tests. Would
> > it be possible for you to integrate this into the xskxceiver
> > framework? You can find that in selftests/bpf too. By default, it
> > will run its tests using veth, but if you provide an interface name
> > after the -i option, it will run the tests over a real interface. I
> > put the NIC in loopback mode to use this feature, but feel free to
> > add a new mode if necessary. A lot of the setup and data plane code
> > that you add already exists in xskxceiver, so I would prefer if you
> > could reuse it. Your tests are new though and they would be
> > valuable to have.
>
> +1
>
> I just don't believe that you guys were not aware that xskxceiver
> exists. Please provide us a proper explanation/justification for why
> it was not fulfilling your needs and you decided to go with another
> test suite.

To answer this question: I can't speak for others, but I personally was
not fully aware of it.

Over a year ago, when we were testing AF_XDP latency on internal NIC
drivers, we extended our internal latency test tool to support AF_XDP.
That was when we observed that the NICs we were testing had faulty
implementations - panics, packet corruption, random drops - so we
decided to distill the latency suite into a simple pass/fail test for
our testing infrastructure, which we named xsk_hw. The test was
specifically designed to exercise hardware NICs (rather than veth), and
there was a fair amount of code around it to reserve and set up
machines, and to obtain information such as the IP addresses and the
host and next-hop MAC addresses. At the time, the code was deemed too
dependent on our internal multi-machine testing infrastructure to
upstream, but it has been running as part of our test suite ever since.

This brings us to recently: I was informed that upstream now has
drv-net, and since that means upstream also has multi-machine testing,
it was time to upstream the test. Hence this patch series, which I made
after adapting the code to use drv-net and network_helpers.

As for xskxceiver, speaking for myself, I discarded the idea after
reading its initial block comment, which says it spawns two threads on
a veth pair to test AF_XDP. My reaction was "okay, this doesn't test
hardware NICs, and extending it to hardware is probably a major rewrite
that isn't worth it", so I did not look too deeply into its code. I was
unaware that it can test a real interface, and that's partially my
fault. I'll take a look at xskxceiver and see how feasible it is to
integrate this into it.

> > You could make the default packet that is sent in xskxceiver be the
> > UDP packet that you want and then add all the other logic that you
> > have to a number of new tests that you introduce.
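
For concreteness, since the suggestion is to make this the default
packet: what xsk_hw puts on the wire is a single Ethernet/IPv4/UDP
frame, along these lines (an illustrative sketch only; the ports below
are placeholders rather than what the series actually hardcodes, and
the IP checksum is left to a helper):

#include <string.h>
#include <arpa/inet.h>
#include <linux/if_ether.h>
#include <linux/ip.h>
#include <linux/udp.h>

/* Build an Ethernet/IPv4/UDP frame into an umem buffer and return the
 * frame length. Sketch only, not the exact code from patch 2. */
static size_t build_udp_frame(void *buf, const unsigned char *src_mac,
			      const unsigned char *dst_mac, __be32 saddr,
			      __be32 daddr, const void *payload, size_t len)
{
	struct ethhdr *eth = buf;
	struct iphdr *iph = (struct iphdr *)(eth + 1);
	struct udphdr *udp = (struct udphdr *)(iph + 1);

	memset(buf, 0, sizeof(*eth) + sizeof(*iph) + sizeof(*udp));

	memcpy(eth->h_dest, dst_mac, ETH_ALEN);
	memcpy(eth->h_source, src_mac, ETH_ALEN);
	eth->h_proto = htons(ETH_P_IP);

	iph->version = 4;
	iph->ihl = 5;			/* no IP options */
	iph->tot_len = htons(sizeof(*iph) + sizeof(*udp) + len);
	iph->ttl = 64;
	iph->protocol = IPPROTO_UDP;
	iph->saddr = saddr;
	iph->daddr = daddr;
	iph->check = 0;			/* fill in with a csum helper */

	udp->source = htons(9091);	/* placeholder port */
	udp->dest = htons(9091);	/* placeholder port */
	udp->len = htons(sizeof(*udp) + len);
	udp->check = 0;			/* optional for IPv4 */

	memcpy(udp + 1, payload, len);
	return sizeof(*eth) + sizeof(*iph) + sizeof(*udp) + len;
}

If xskxceiver's default packet can take this shape, then yes, the rest
of our logic would presumably become new tests on top of it.
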
> >
> > > Tested on Google Cloud, with GVE:
> > >
> > > $ sudo NETIF=ens4 REMOTE_TYPE=ssh \
> > >       REMOTE_ARGS="root@10.138.15.235" \
> > >       LOCAL_V4="10.138.15.234" \
> > >       REMOTE_V4="10.138.15.235" \
> > >       LOCAL_NEXTHOP_MAC="42:01:0a:8a:00:01" \
> > >       REMOTE_NEXTHOP_MAC="42:01:0a:8a:00:01" \
> > >       python3 xsk_hw.py
> > >
> > > KTAP version 1
> > > 1..22
> > > ok 1 xsk_hw.ipv4_basic
> > > ok 2 xsk_hw.ipv4_tx_skb_copy
> > > ok 3 xsk_hw.ipv4_tx_skb_copy_force_attach
> > > ok 4 xsk_hw.ipv4_rx_skb_copy
> > > ok 5 xsk_hw.ipv4_tx_drv_copy
> > > ok 6 xsk_hw.ipv4_tx_drv_copy_force_attach
> > > ok 7 xsk_hw.ipv4_rx_drv_copy
> > > [...]
> > > # Exception| STDERR: b'/tmp/zzfhcqkg/pbgodkgjxsk_hw: recv_pfpacket: Timeout\n'
> > > not ok 8 xsk_hw.ipv4_tx_drv_zerocopy
> > > ok 9 xsk_hw.ipv4_tx_drv_zerocopy_force_attach
> > > ok 10 xsk_hw.ipv4_rx_drv_zerocopy
> > > [...]
> > > # Exception| STDERR: b'/tmp/zzfhcqkg/pbgodkgjxsk_hw: connect sync client: max_retries\n'
> > > [...]
> > > # Exception| STDERR: b'/linux/tools/testing/selftests/bpf/xsk_hw: open_xsk: Device or resource busy\n'
> > > not ok 11 xsk_hw.ipv4_rx_drv_zerocopy_fill_after_bind
> > > ok 12 xsk_hw.ipv6_basic # SKIP Test requires IPv6 connectivity
> > > [...]
> > > ok 22 xsk_hw.ipv6_rx_drv_zerocopy_fill_after_bind # SKIP Test requires IPv6 connectivity
> > > # Totals: pass:9 fail:2 xfail:0 xpass:0 skip:11 error:0
> > >
> > > YiFei Zhu (3):
> > >   selftests/bpf: Move rxq_num helper from xdp_hw_metadata to
> > >     network_helpers
> > >   selftests/bpf: Add xsk_hw AF_XDP functionality test
> > >   selftests: drv-net: Add xsk_hw AF_XDP functionality test
> > >
> > >  tools/testing/selftests/bpf/.gitignore        |   1 +
> > >  tools/testing/selftests/bpf/Makefile          |   7 +-
> > >  tools/testing/selftests/bpf/network_helpers.c |  27 +
> > >  tools/testing/selftests/bpf/network_helpers.h |  16 +
> > >  tools/testing/selftests/bpf/progs/xsk_hw.c    |  72 ++
> > >  tools/testing/selftests/bpf/xdp_hw_metadata.c |  27 +-
> > >  tools/testing/selftests/bpf/xsk_hw.c          | 844 ++++++++++++++++++
> > >  .../testing/selftests/drivers/net/hw/Makefile |   1 +
> > >  .../selftests/drivers/net/hw/xsk_hw.py        | 133 +++
> > >  9 files changed, 1102 insertions(+), 26 deletions(-)
> > >  create mode 100644 tools/testing/selftests/bpf/progs/xsk_hw.c
> > >  create mode 100644 tools/testing/selftests/bpf/xsk_hw.c
> > >  create mode 100755 tools/testing/selftests/drivers/net/hw/xsk_hw.py
> > >
> > > --
> > > 2.45.2.505.gda0bf45e8d-goog
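
P.S. For reviewers skimming the diffstat: the rxq_num() helper that
patch 1 moves from xdp_hw_metadata.c into network_helpers is
essentially an ETHTOOL_GCHANNELS query. Roughly (a from-memory sketch,
not the verbatim code being moved):

#include <errno.h>
#include <string.h>
#include <unistd.h>
#include <net/if.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <linux/ethtool.h>
#include <linux/sockios.h>

/* Number of RX queues on ifname, via the ethtool channels ioctl. */
static int rxq_num(const char *ifname)
{
	struct ethtool_channels ch = { .cmd = ETHTOOL_GCHANNELS };
	struct ifreq ifr = {};
	int fd, ret;

	strncpy(ifr.ifr_name, ifname, IF_NAMESIZE - 1);
	ifr.ifr_data = (void *)&ch;

	fd = socket(AF_UNIX, SOCK_DGRAM, 0);
	if (fd < 0)
		return -errno;

	ret = ioctl(fd, SIOCETHTOOL, &ifr);
	close(fd);
	if (ret < 0)
		return -errno;

	/* Drivers expose RX queues either as dedicated RX channels or
	 * as combined channels, so sum both counts. */
	return ch.rx_count + ch.combined_count;
}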