On Sun, Sep 29, 2024 at 10:45:06AM +0800, Wei Fang wrote: > When running "xdp-bench tx eno0" to test the XDP_TX feature of ENETC > on LS1028A, it was found that if the command was re-run multiple times, > Rx could not receive the frames, and the result of xdo-bench showed > that the rx rate was 0. > > root@ls1028ardb:~# ./xdp-bench tx eno0 > Hairpinning (XDP_TX) packets on eno0 (ifindex 3; driver fsl_enetc) > Summary 2046 rx/s 0 err,drop/s > Summary 0 rx/s 0 err,drop/s > Summary 0 rx/s 0 err,drop/s > Summary 0 rx/s 0 err,drop/s > > By observing the Rx PIR and CIR registers, we found that CIR is always > equal to 0x7FF and PIR is always 0x7FE, which means that the Rx ring > is full and can no longer accommodate other Rx frames. Therefore, we > can conclude that the problem is caused by the Rx BD ring not being > cleaned up. > > Further analysis of the code revealed that the Rx BD ring will only > be cleaned if the "cleaned_cnt > xdp_tx_in_flight" condition is met. > Therefore, some debug logs were added to the driver and the current > values of cleaned_cnt and xdp_tx_in_flight were printed when the Rx > BD ring was full. The logs are as follows. > > [ 178.762419] [XDP TX] >> cleaned_cnt:1728, xdp_tx_in_flight:2140 > [ 178.771387] [XDP TX] >> cleaned_cnt:1941, xdp_tx_in_flight:2110 > [ 178.776058] [XDP TX] >> cleaned_cnt:1792, xdp_tx_in_flight:2110 > > From the results, we can see that the max value of xdp_tx_in_flight > has reached 2140. However, the size of the Rx BD ring is only 2048. > This is incredible, so we checked the code again and found that > xdp_tx_in_flight did not drop to 0 when the bpf program was uninstalled > and it was not reset when the bfp program was installed again. The > root cause is that the IRQ is disabled too early in enetc_stop(), > resulting in enetc_recycle_xdp_tx_buff() not being called, therefore, > xdp_tx_in_flight is not cleared. > > Fixes: ff58fda09096 ("net: enetc: prioritize ability to go down over packet processing") > Cc: stable@xxxxxxxxxxxxxxx > Signed-off-by: Wei Fang <wei.fang@xxxxxxx> > --- > v2 changes: > 1. Modify the titile and rephrase the commit meesage. > 2. Use the new solution as described in the title > --- I gave this another test under a bit different set of circumstances this time, and I'm confident that there are still problems, which I haven't identified though (yet). With 64 byte frames at 2.5 Gbps, I see this going on: $ xdp-bench tx eno0 & $ while :; do taskset $((1 << 0)) hwstamp_ctl -i eno0 -r 1 && sleep 1 && taskset $((1 << 0)) hwstamp_ctl -i eno0 -r 0 && sleep 1; done current settings: tx_type 0 rx_filter 0 new settings: tx_type 0 rx_filter 1 Summary 1,556,952 rx/s 0 err,drop/s Summary 0 rx/s 0 err,drop/s Summary 0 rx/s 0 err,drop/s current settings: tx_type 0 rx_filter 1 Summary 0 rx/s 0 err,drop/s [ 883.780346] fsl_enetc 0000:00:00.0 eno0: timeout for tx ring #6 clear (its RX ring has 2072 XDP_TX frames in flight) new settings: tx_type 0 rx_filter 0 Summary 1,027 rx/s 0 err,drop/s current settings: tx_type 0 rx_filter 0 Summary 0 rx/s 0 err,drop/s which looks like the symptoms that the patch tries to solve. My previous testing was with 390 byte frames, and this did not happen. Please do not merge this.