Best Regards, Wei Fang Hi Vladimir, > > > > On Sun, Sep 29, 2024 at 10:45:06AM +0800, Wei Fang wrote: > > > When running "xdp-bench tx eno0" to test the XDP_TX feature of ENETC > > > on LS1028A, it was found that if the command was re-run multiple times, > > > Rx could not receive the frames, and the result of xdo-bench showed > > > that the rx rate was 0. > > > > > > root@ls1028ardb:~# ./xdp-bench tx eno0 > > > Hairpinning (XDP_TX) packets on eno0 (ifindex 3; driver fsl_enetc) > > > Summary 2046 rx/s 0 > > err,drop/s > > > Summary 0 rx/s 0 > > err,drop/s > > > Summary 0 rx/s 0 > > err,drop/s > > > Summary 0 rx/s 0 > > err,drop/s > > > > > > By observing the Rx PIR and CIR registers, we found that CIR is always > > > equal to 0x7FF and PIR is always 0x7FE, which means that the Rx ring > > > is full and can no longer accommodate other Rx frames. Therefore, we > > > can conclude that the problem is caused by the Rx BD ring not being > > > cleaned up. > > > > > > Further analysis of the code revealed that the Rx BD ring will only > > > be cleaned if the "cleaned_cnt > xdp_tx_in_flight" condition is met. > > > Therefore, some debug logs were added to the driver and the current > > > values of cleaned_cnt and xdp_tx_in_flight were printed when the Rx > > > BD ring was full. The logs are as follows. > > > > > > [ 178.762419] [XDP TX] >> cleaned_cnt:1728, xdp_tx_in_flight:2140 > > > [ 178.771387] [XDP TX] >> cleaned_cnt:1941, xdp_tx_in_flight:2110 > > > [ 178.776058] [XDP TX] >> cleaned_cnt:1792, xdp_tx_in_flight:2110 > > > > > > From the results, we can see that the max value of xdp_tx_in_flight > > > has reached 2140. However, the size of the Rx BD ring is only 2048. > > > This is incredible, so we checked the code again and found that > > > xdp_tx_in_flight did not drop to 0 when the bpf program was uninstalled > > > and it was not reset when the bfp program was installed again. The > > > root cause is that the IRQ is disabled too early in enetc_stop(), > > > resulting in enetc_recycle_xdp_tx_buff() not being called, therefore, > > > xdp_tx_in_flight is not cleared. > > > > > > Fixes: ff58fda09096 ("net: enetc: prioritize ability to go down over packet > > processing") > > > Cc: stable@xxxxxxxxxxxxxxx > > > Signed-off-by: Wei Fang <wei.fang@xxxxxxx> > > > --- > > > v2 changes: > > > 1. Modify the titile and rephrase the commit meesage. > > > 2. Use the new solution as described in the title > > > --- > > > > I gave this another test under a bit different set of circumstances this time, > > and I'm confident that there are still problems, which I haven't identified > > though (yet). > > > > With 64 byte frames at 2.5 Gbps, I see this going on: > > > > $ xdp-bench tx eno0 & > > $ while :; do taskset $((1 << 0)) hwstamp_ctl -i eno0 -r 1 && sleep 1 && > taskset > > $((1 << 0)) hwstamp_ctl -i eno0 -r 0 && sleep 1; done > > current settings: > > tx_type 0 > > rx_filter 0 > > new settings: > > tx_type 0 > > rx_filter 1 > > Summary 1,556,952 rx/s 0 > err,drop/s > > Summary 0 rx/s 0 > err,drop/s > > Summary 0 rx/s 0 > err,drop/s > > current settings: > > tx_type 0 > > rx_filter 1 > > Summary 0 rx/s 0 > err,drop/s > > [ 883.780346] fsl_enetc 0000:00:00.0 eno0: timeout for tx ring #6 clear (its > > RX ring has 2072 XDP_TX frames in flight) > > new settings: > > tx_type 0 > > rx_filter 0 > > Summary 1,027 rx/s 0 > err,drop/s > > current settings: > > tx_type 0 > > rx_filter 0 > > Summary 0 rx/s 0 > err,drop/s > > > > which looks like the symptoms that the patch tries to solve. > > > > My previous testing was with 390 byte frames, and this did not happen. > > > > Please do not merge this. > > Oh, it looks like there are still some issues we don't know about. I did > test using 64 bytes but not at that high of a rate. Also I didn't turn on > timestamp. Anyway, I will try to reproduce the issue when I'm back to > office next Tuesday. It would be nice if you can help find the root cause > before next Tuesday, thanks! I think the reason is that Rx BDRs are disabled when enetc_stop() is called, but there are still many unprocessed frames on Rx BDR. These frames will be processed by XDP program and put into Tx BDR. So enetc_wait_txbdr() will timeout and cause xdp_tx_in_flight will not be cleared. So based on this patch, we should add a separate patch, similar to the patch 2 ("net: enetc: fix the issues of XDP_REDIRECT feature "), which prevents the XDP_TX frames from being put into Tx BDRs when the ENETC_TX_DOWN flag is set. The new patch is shown below. After adding this new patch, I followed your test steps and tested for more than 30 minutes, and the issue cannot be reproduced anymore (without this patch, this problem would be reproduced within seconds). --- a/drivers/net/ethernet/freescale/enetc/enetc.c +++ b/drivers/net/ethernet/freescale/enetc/enetc.c @@ -1606,6 +1606,12 @@ static int enetc_clean_rx_ring_xdp(struct enetc_bdr *rx_ring, break; case XDP_TX: tx_ring = priv->xdp_tx_ring[rx_ring->index]; + if (unlikely(test_bit(ENETC_TX_DOWN, &priv->flags))) { + enetc_xdp_drop(rx_ring, orig_i, i); + tx_ring->stats.xdp_tx_drops++; + break; + } + xdp_tx_bd_cnt = enetc_rx_swbd_to_xdp_tx_swbd(xdp_tx_arr, rx_ring, orig_i, i);