Hi there,

We're looking at developing some software that uses XSK in zero copy mode, where we either redirect packets to userspace using AF_XDP, or transmit packets straight from the XDP kernel program using XDP_TX.

Our program is the same one as described here:

Recently we've been testing some functionality that transmits packets directly from the data plane / XDP code using XDP_TX. This functionality works on a Mellanox MT27710 ConnectX-4 Lx NIC using the mlx5_core driver. However, on an Intel NIC with the ice driver we run into problems. We tested on the 5.15 kernel and on the newer 6.1 kernel, and both show the same behaviour.

Everything below was seen using the Intel NIC with this configuration:

# ethtool -i ice0
driver: ice
version: 6.1.0-0.rc5.el8.elrepo.x86_64
firmware-version: 2.50 0x800077a8 1.2960.0
bus-info: 0000:03:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes

# lspci -s 03:00.0
03:00.0 Ethernet controller: Intel Corporation Ethernet Controller E810-XXV for SFP (rev 02)

# ethtool -g ice0
Ring parameters for ice0:
Pre-set maximums:
RX:         8160
RX Mini:    n/a
RX Jumbo:   n/a
TX:         8160
Current hardware settings:
RX:         4096
RX Mini:    n/a
RX Jumbo:   n/a
TX:         4096

# ethtool -l ice0
Channel parameters for ice0:
Pre-set maximums:
RX:         16
TX:         16
Other:       1
Combined:   16
Current hardware settings:
RX:         0
TX:         0
Other:       1
Combined:   4

When redirecting traffic from the data plane into user-space via XSK, everything works as expected.

When transmitting packets from the data plane directly out of the NIC via XDP_TX, we see our kernel log being flooded (visible through the systemd-journal process). It appears that a kernel warning is generated for every packet sent through XDP_TX.

An example warning and call trace is:

Incorrect XDP memory type (1785255936) usage
WARNING: CPU: 7 PID: 0 at net/core/xdp.c:403 __xdp_return+0x33/0x1f0


Call Trace:
ice_xmit_zc+0x251/0x310 [ice]
ice_napi_poll+0x54/0x640 [ice]

The memory type value seen above changes with each warning, suggesting that the value is uninitialized or that the pointer is corrupted.
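To illustrate why the reported value looks like garbage rather than a real memory type, here is a small userspace sketch in C. It assumes the memory type enum has the four values we believe include/net/xdp.h defines in these kernel versions; the SKETCH_ names and the mem_type_is_valid() helper are ours and are not kernel symbols.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed mirror of the kernel's XDP memory types (names are ours). */
enum sketch_xdp_mem_type {
    SKETCH_MEM_TYPE_PAGE_SHARED = 0,
    SKETCH_MEM_TYPE_PAGE_ORDER0,
    SKETCH_MEM_TYPE_PAGE_POOL,
    SKETCH_MEM_TYPE_XSK_BUFF_POOL,
    SKETCH_MEM_TYPE_MAX /* one past the last valid type */
};

/* Hypothetical helper: true if the value is one of the known types. */
bool mem_type_is_valid(uint32_t type)
{
    return type < SKETCH_MEM_TYPE_MAX;
}
```

Under that assumption, mem_type_is_valid(1785255936u) is false, which is consistent with the field being read from uninitialized or unrelated memory.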

We have been able to recreate the issue using a program based on the xdpsock sample programs from the kernel tree to validate it’s not specific to our software.

We have been testing a simple BPF program that swaps the source and destination MAC addresses and transmits the packet back out of the same NIC. This can be seen here: on the test_zero_copy_tx branch, which has the very basic BPF program. The issue only occurs when testing with multiple FCQs; it seems to work fine with a single FCQ. The issue happens in both copy mode and zero copy mode.
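For readers without access to the branch, the MAC swap amounts to roughly the following. This is a plain userspace C sketch of the logic only, not the actual BPF program: the function name swap_eth_addrs() and the SKETCH_ constants are ours, and the length check stands in for the data/data_end bounds check an XDP program would need before returning XDP_TX.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_ETH_ALEN 6   /* bytes per MAC address */
#define SKETCH_ETH_HLEN 14  /* dst MAC + src MAC + EtherType */

/*
 * Swap the destination and source MAC addresses at the start of an
 * Ethernet frame. Returns 0 on success, -1 if the buffer is missing
 * or too short to hold a full Ethernet header.
 */
int swap_eth_addrs(uint8_t *frame, size_t len)
{
    uint8_t tmp[SKETCH_ETH_ALEN];

    if (frame == NULL || len < SKETCH_ETH_HLEN)
        return -1;

    memcpy(tmp, frame, SKETCH_ETH_ALEN);                     /* save dst */
    memcpy(frame, frame + SKETCH_ETH_ALEN, SKETCH_ETH_ALEN); /* dst = src */
    memcpy(frame + SKETCH_ETH_ALEN, tmp, SKETCH_ETH_ALEN);   /* src = old dst */
    return 0;
}
```

The EtherType and payload are left untouched, so the frame goes back out of the same port addressed to its original sender.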

The command used was:

./xdpsock_multi --extra-stats --l2fwd --zero-copy --interface ice0 --channels=2 --busy-poll

It is my belief that this is a supported scenario, but I’m seeking some guidance to validate my thoughts, and ultimately whether this is a legitimate bug.

I hope this gives enough background and information for a reproducible issue. Any feedback is welcome and we look forward to hearing a response. :)
