On Wed, Dec 20, 2023 at 1:54 PM Magnus Karlsson <magnus.karlsson@xxxxxxxxx> wrote:
>
> On Tue, 19 Dec 2023 at 21:18, Prashant Batra <prbatra.mail@xxxxxxxxx> wrote:
> >
> > Thanks for your response. My comments inline.
> >
> > On Tue, Dec 19, 2023 at 7:17 PM Magnus Karlsson
> > <magnus.karlsson@xxxxxxxxx> wrote:
> > >
> > > On Tue, 19 Dec 2023 at 11:46, Prashant Batra <prbatra.mail@xxxxxxxxx> wrote:
> > > >
> > > > Hi,
> > > >
> > > > I am new to XDP and exploring how it works with the different
> > > > interface types supported in Linux. One of my use cases is to
> > > > receive packets from a bond interface.
> > > > I used the xdpsock sample program, specifying the bond interface
> > > > as the input interface. However, the packets received on the bond
> > > > interface are not handed over to the socket by the kernel if the
> > > > socket is bound in native mode. Nor are the packets passed up to
> > > > the kernel stack. Note that the socket creation does succeed.
> > > > In skb mode this works and I am able to receive packets in
> > > > userspace, but in skb mode, as expected, the performance is not
> > > > that great.
> > > >
> > > > Are AF_XDP sockets on a bond not supported in native mode? Or,
> > > > since the packet has to be handed over to the bond driver after
> > > > reception on the phy port, is an skb allocation and a copy into
> > > > it indeed a must?
> > >
> > > I have never tried a bonding interface with AF_XDP, so it might not
> > > work. Can you trace the packet to see where it is being dropped in
> > > native mode? There are no modifications needed to an XDP_REDIRECT
> > > enabled driver to support AF_XDP in XDP_DRV / copy mode. What NICs
> > > are you using?
> > >
> > I will trace the packet and get back.
> > The bond is over 2 physical ports that are part of an Intel NIC:
> >
> > b3:00.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit
> > SFI/SFP+ Network Connection (rev 01)
> > b3:00.1 Ethernet controller: Intel Corporation 82599ES 10-Gigabit
> > SFI/SFP+ Network Connection (rev 01)
> >
> > The bonding algorithm is 802.3ad.
> >
> > The CPU is an Intel Xeon Gold, 3.40GHz.
> >
> > NIC driver:
> > # ethtool -i ens1f0
> > driver: ixgbe
> > version: 5.14.0-362.13.1.el9_3
>
> Could you please try with the latest kernel 6.7? 5.14 is quite old and
> a lot of things have happened since then.

I tried with kernel 6.6.8-1.el9.elrepo.x86_64. I still see the same
issue.

> > Features:
> > # xdp-loader features ens1f0
> > NETDEV_XDP_ACT_BASIC:        yes
> > NETDEV_XDP_ACT_REDIRECT:     yes
> > NETDEV_XDP_ACT_NDO_XMIT:     no
> > NETDEV_XDP_ACT_XSK_ZEROCOPY: yes
> > NETDEV_XDP_ACT_HW_OFFLOAD:   no
> > NETDEV_XDP_ACT_RX_SG:        no
> > NETDEV_XDP_ACT_NDO_XMIT_SG:  no
> >
> > An interesting thing is that bond0 advertises both native and ZC
> > mode, because the features are copied from the slave devices. That
> > explains why there is no error while binding the socket in
> > native/zero-copy mode.
>
> It is probably the intention that if both of the bonded devices
> support a feature, then the bonding device will too. I just saw that
> the bonding device does not implement xsk_wakeup, which is used by
> zero-copy, so zero-copy is not really supported and that support
> should not be advertised. The code in AF_XDP tests for zero-copy
> support this way:
>
> if ((netdev->xdp_features & NETDEV_XDP_ACT_ZC) != NETDEV_XDP_ACT_ZC) {
>         err = -EOPNOTSUPP;
>         goto err_unreg_pool;
> }
>
> So there are some things needed in the bonding driver to make
> zero-copy work. Might not be much though.
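Out of curiosity, I tried to picture what the missing wakeup hook could
look like if it simply forwarded the call to a slave. The following is
only a rough, untested sketch, not actual bonding-driver code, and the
assumption that the current active slave owns the queue is mine:

#include <net/bonding.h>

/* Hypothetical sketch: forward an AF_XDP wakeup from the bond to the
 * slave device backing the queue. */
static int bond_xsk_wakeup(struct net_device *bond_dev, u32 queue_id,
                           u32 flags)
{
        struct bonding *bond = netdev_priv(bond_dev);
        struct slave *slave;
        int ret = -ENXIO;

        rcu_read_lock();
        /* Assumed for illustration: the active slave owns queue_id.
         * A real implementation would need a queue-to-slave mapping
         * that also holds for 802.3ad with multiple active ports. */
        slave = rcu_dereference(bond->curr_active_slave);
        if (slave && slave->dev->netdev_ops->ndo_xsk_wakeup)
                ret = slave->dev->netdev_ops->ndo_xsk_wakeup(slave->dev,
                                                             queue_id,
                                                             flags);
        rcu_read_unlock();

        return ret;
}

If something along these lines is all that is missing, then, as you
say, it might not be much.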
> But your problem is with XDP_DRV and copy mode, so let us start
> there.

> > void bond_xdp_set_features(struct net_device *bond_dev)
> > {
> > ..
> >         bond_for_each_slave(bond, slave, iter)
> >                 val &= slave->dev->xdp_features;
> >         xdp_set_features_flag(bond_dev, val);
> > }
> >
> > # ../xdp-loader/xdp-loader features bond0
> > NETDEV_XDP_ACT_BASIC:        yes
> > NETDEV_XDP_ACT_REDIRECT:     yes
> > NETDEV_XDP_ACT_NDO_XMIT:     no
> > NETDEV_XDP_ACT_XSK_ZEROCOPY: yes
> > NETDEV_XDP_ACT_HW_OFFLOAD:   no
> > NETDEV_XDP_ACT_RX_SG:        no
> > NETDEV_XDP_ACT_NDO_XMIT_SG:  no
> >
> > > > Another thing I notice is that other XDP programs attached to
> > > > the bond interface, with targets like DROP or REDIRECT to
> > > > another interface, work and perform better than the AF_XDP (skb)
> > > > based one. Does this mean that these are not allocating an skb?
> > >
> > > I am not surprised that AF_XDP in copy mode is slower than
> > > XDP_REDIRECT. The packet has to be copied out to user space and
> > > then copied into the kernel again, something that is not needed in
> > > the XDP_REDIRECT case. If you were using zero-copy, on the other
> > > hand, it would be faster with AF_XDP. But the bonding interface
> > > does not support zero-copy, so that is not an option.
> > >
> > Just to put forward the pps numbers with the above-mentioned single
> > port in different modes, and a comparison to the bond interface.
> > The test uses pktgen pumping 64-byte packets on a single flow.
> >
> > Single AF_XDP socket on a single NIC queue:
> >
> > AF_XDP rxdrop  PPS   CPU-SI*  CPU-xdpsock  Command
> > ══════════════════════════════════════════════════════════
> > ZC             14M   65%      35%          ./xdpsock -r -i ens1f0 -q 5 -p -n 1 -N -z
> > XDP_DRV/COPY   10M   100%     23%          ./xdpsock -r -i ens1f0 -q 5 -p -n 1 -N -c
> > SKB_MODE       2.2M  100%     62%          ./xdpsock -r -i ens1f0 -q 5 -p -n 1 -S
> > * CPU receiving the packet
> >
> > In the above tests, when using ZC and XDP_DRV/COPY, is this SI usage
> > as expected, especially in ZC mode? Is it mainly because the BPF
> > program runs in non-HW-offloaded mode? I do not have a NIC that can
> > run BPF in offloaded mode, so I cannot compare.
>
> I get about 25 - 30 Mpps at 100% CPU load on my system, but I have a
> 100G card and you are maxing out your 10G card at 65% and 14M. So
> yes, that sounds reasonable. HW offload cannot be used with AF_XDP;
> you need to do the redirect on the CPU for it to work. If you want to
> know where time is spent, use "perf top". The biggest chunk of time
> is spent in the XDP_REDIRECT operation, but there are many other time
> thieves too.
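Thanks, I will dig into it with "perf top". For my own understanding of
where the redirect cost comes from: the per-packet work should
essentially be a program like the minimal sketch below. This is my own
simplified assumption of what xdpsock/libxdp attaches (one XSKMAP slot
per RX queue), not the actual code:

#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

struct {
        __uint(type, BPF_MAP_TYPE_XSKMAP);
        __uint(max_entries, 64);        /* assumed: one slot per RX queue */
        __type(key, __u32);
        __type(value, __u32);
} xsks_map SEC(".maps");

SEC("xdp")
int xsk_redirect_prog(struct xdp_md *ctx)
{
        /* Redirect to the AF_XDP socket bound to this RX queue; fall
         * back to XDP_PASS (kernel stack) if no socket is registered. */
        return bpf_redirect_map(&xsks_map, ctx->rx_queue_index, XDP_PASS);
}

char _license[] SEC("license") = "GPL";

So every packet pays for the map lookup and the redirect machinery even
before the copy to user space, which matches the XDP_REDIRECT hotspot
you mention.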
> > The XDP_DROP target using the xdp-bench tool (from xdp-tools) on
> > the same NIC port:
> >
> > xdp-bench        PPS  CPU-SI*  Command
> > ═══════════════════════════════════════════════
> > drop, no-touch   14M  41%      ./xdp-bench drop -p no-touch ens1f0 -e
> > drop, read-data  14M  55%      ./xdp-bench drop -p read-data ens1f0 -e
> > drop, parse-ip   14M  58%      ./xdp-bench drop -p parse-ip ens1f0 -e
> > * CPU receiving the packet
> >
> > Similar tests on the bond interface (the 2 ports mentioned above,
> > bonded):
> >
> > AF_XDP rxdrop  PPS  CPU-SI*  CPU-xdpsock  Command
> > ══════════════════════════════════════════════════════════
> > ZC             X    X        X            ./xdpsock -r -i bond0 -q 0 -p -n 1 -N -z
> > XDP_DRV/COPY   X    X        X            ./xdpsock -r -i bond0 -q 0 -p -n 1 -N -c
> > SKB_MODE       2M   100%     55%          ./xdpsock -r -i bond0 -q 0 -p -n 1 -S
> > * CPU receiving the packet
> >
> > xdp-bench        PPS    CPU-SI*  Command
> > ═══════════════════════════════════════════════
> > drop, no-touch   10.9M  33%      ./xdp-bench drop -p no-touch bond0 -e
> > drop, read-data  10.9M  44%      ./xdp-bench drop -p read-data bond0 -e
> > drop, parse-ip   10.9M  47%      ./xdp-bench drop -p parse-ip bond0 -e
> > * CPU receiving the packet
> >
> > > > Kindly share your thoughts and advice.
> > > >
> > > > Thanks,
> > > > Prashant
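On tracing where the native-mode packets on bond0 are dropped: I plan
to start by logging failed redirects via the xdp tracepoints, roughly
as in the sketch below. It is untested, and the context struct is my
hand-written mirror of the tracepoint's format file
(/sys/kernel/debug/tracing/events/xdp/xdp_redirect_err/format), so the
field layout must be verified against the running kernel:

#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

/* Assumed mirror of the xdp_redirect_err tracepoint context; check it
 * against the format file before trusting the offsets. */
struct xdp_redirect_err_ctx {
        __u64 unused;           /* common tracepoint fields */
        int prog_id;
        __u32 act;
        int ifindex;
        int err;
        int to_ifindex;
        __u32 map_id;
        int map_index;
};

SEC("tracepoint/xdp/xdp_redirect_err")
int log_redirect_err(struct xdp_redirect_err_ctx *ctx)
{
        /* Print the ingress ifindex and errno for every failed
         * redirect; read the output from the trace_pipe. */
        bpf_printk("xdp redirect failed: ifindex %d err %d",
                   ctx->ifindex, ctx->err);
        return 0;
}

char _license[] SEC("license") = "GPL";

If I remember right, xdp-monitor from xdp-tools reports these same
events without any custom code, so I will compare against that as
well.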