On Thu, Oct 10, 2024 at 05:20:56PM +0800, Wei Fang wrote: > When running "xdp-bench tx eno0" to test the XDP_TX feature of ENETC > on LS1028A, it was found that if the command was re-run multiple times, > Rx could not receive the frames, and the result of xdp-bench showed > that the rx rate was 0. > > root@ls1028ardb:~# ./xdp-bench tx eno0 > Hairpinning (XDP_TX) packets on eno0 (ifindex 3; driver fsl_enetc) > Summary 2046 rx/s 0 err,drop/s > Summary 0 rx/s 0 err,drop/s > Summary 0 rx/s 0 err,drop/s > Summary 0 rx/s 0 err,drop/s > > By observing the Rx PIR and CIR registers, CIR is always 0x7FF and > PIR is always 0x7FE, which means that the Rx ring is full and can no > longer accommodate other Rx frames. Therefore, the problem is caused > by the Rx BD ring not being cleaned up. > > Further analysis of the code revealed that the Rx BD ring will only > be cleaned if the "cleaned_cnt > xdp_tx_in_flight" condition is met. > Therefore, some debug logs were added to the driver and the current > values of cleaned_cnt and xdp_tx_in_flight were printed when the Rx > BD ring was full. The logs are as follows. > > [ 178.762419] [XDP TX] >> cleaned_cnt:1728, xdp_tx_in_flight:2140 > [ 178.771387] [XDP TX] >> cleaned_cnt:1941, xdp_tx_in_flight:2110 > [ 178.776058] [XDP TX] >> cleaned_cnt:1792, xdp_tx_in_flight:2110 > > From the results, the max value of xdp_tx_in_flight has reached 2140. > However, the size of the Rx BD ring is only 2048. So xdp_tx_in_flight > did not drop to 0 after enetc_stop() is called and the driver does not > clear it. The root cause is that NAPI is disabled too aggressively, > without having waited for the pending XDP_TX frames to be transmitted, > and their buffers recycled, so that xdp_tx_in_flight cannot naturally > drop to 0. Later, enetc_free_tx_ring() does free those stale, unsent > XDP_TX packets, but it is not coded up to also reset xdp_tx_in_flight, > hence the manifestation of the bug. > > One option would be to cover this extra condition in enetc_free_tx_ring(), > but now that the ENETC_TX_DOWN exists, we have created a window at > the beginning of enetc_stop() where NAPI can still be scheduled, but > any concurrent enqueue will be blocked. Therefore, enetc_wait_bdrs() > and enetc_disable_tx_bdrs() can be called with NAPI still scheduled, > and it is guaranteed that this will not wait indefinitely, but instead > give us an indication that the pending TX frames have orderly dropped > to zero. Only then should we call napi_disable(). > > This way, enetc_free_tx_ring() becomes entirely redundant and can be > dropped as part of subsequent cleanup. > > The change also refactors enetc_start() so that it looks like the > mirror opposite procedure of enetc_stop(). > > Fixes: ff58fda09096 ("net: enetc: prioritize ability to go down over packet processing") > Cc: stable@xxxxxxxxxxxxxxx > Signed-off-by: Wei Fang <wei.fang@xxxxxxx> > --- > v2 changes: > 1. Modify the titile and rephrase the commit meesage. > 2. Use the new solution as described in the title > v3: no changes. > v4 changes: > 1. Modify the title and rephrase the commit message. > --- Reviewed-by: Vladimir Oltean <vladimir.oltean@xxxxxxx> Tested-by: Vladimir Oltean <vladimir.oltean@xxxxxxx>