> -----Original Message----- > From: Vladimir Oltean <vladimir.oltean@xxxxxxx> > Sent: 2024年10月9日 20:10 > To: Wei Fang <wei.fang@xxxxxxx> > Cc: davem@xxxxxxxxxxxxx; edumazet@xxxxxxxxxx; kuba@xxxxxxxxxx; > pabeni@xxxxxxxxxx; Claudiu Manoil <claudiu.manoil@xxxxxxx>; > ast@xxxxxxxxxx; daniel@xxxxxxxxxxxxx; hawk@xxxxxxxxxx; > john.fastabend@xxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; > netdev@xxxxxxxxxxxxxxx; bpf@xxxxxxxxxxxxxxx; stable@xxxxxxxxxxxxxxx; > imx@xxxxxxxxxxxxxxx; rkannoth@xxxxxxxxxxx; maciej.fijalkowski@xxxxxxxxx; > sbhatta@xxxxxxxxxxx > Subject: Re: [PATCH v3 net 3/3] net: enetc: disable IRQ after Rx and Tx BD rings > are disabled > > On Wed, Oct 09, 2024 at 05:03:27PM +0800, Wei Fang wrote: > > When running "xdp-bench tx eno0" to test the XDP_TX feature of ENETC > > on LS1028A, it was found that if the command was re-run multiple times, > > Rx could not receive the frames, and the result of xdo-bench showed > > xdp-bench > > > that the rx rate was 0. > > > > root@ls1028ardb:~# ./xdp-bench tx eno0 > > Hairpinning (XDP_TX) packets on eno0 (ifindex 3; driver fsl_enetc) > > Summary 2046 rx/s 0 > err,drop/s > > Summary 0 rx/s 0 > err,drop/s > > Summary 0 rx/s 0 > err,drop/s > > Summary 0 rx/s 0 > err,drop/s > > > > By observing the Rx PIR and CIR registers, we found that CIR is always > > equal to 0x7FF and PIR is always 0x7FE, which means that the Rx ring > > is full and can no longer accommodate other Rx frames. Therefore, we > > can conclude that the problem is caused by the Rx BD ring not being > > cleaned up. > > > > Further analysis of the code revealed that the Rx BD ring will only > > be cleaned if the "cleaned_cnt > xdp_tx_in_flight" condition is met. > > Therefore, some debug logs were added to the driver and the current > > values of cleaned_cnt and xdp_tx_in_flight were printed when the Rx > > BD ring was full. The logs are as follows. > > > > [ 178.762419] [XDP TX] >> cleaned_cnt:1728, xdp_tx_in_flight:2140 > > [ 178.771387] [XDP TX] >> cleaned_cnt:1941, xdp_tx_in_flight:2110 > > [ 178.776058] [XDP TX] >> cleaned_cnt:1792, xdp_tx_in_flight:2110 > > > > From the results, we can see that the max value of xdp_tx_in_flight > > has reached 2140. However, the size of the Rx BD ring is only 2048. > > This is incredible, so we checked the code again and found that > > xdp_tx_in_flight did not drop to 0 when the bpf program was uninstalled > > and it was not reset when the bfp program was installed again. > > Please make it clear that this is more general and it happens whenever > enetc_stop() is called. > > > The root cause is that the IRQ is disabled too early in enetc_stop(), > > resulting in enetc_recycle_xdp_tx_buff() not being called, therefore, > > xdp_tx_in_flight is not cleared. > > I feel that the problem is not so much the IRQ, as the NAPI (softirq), > really. Under heavy traffic we don't even get that many hardirqs (if any), > but NAPI just reschedules itself because of the budget which constantly > gets exceeded. Please make this also clear in the commit title, > something like "net: enetc: disable NAPI only after TX rings are empty". > > I would restate the problem as: "The root cause is that we disable NAPI > too aggressively, without having waited for the pending XDP_TX frames to > be transmitted, and their buffers recycled, so that the xdp_tx_in_flight > counter can naturally drop to zero. Later, enetc_free_tx_ring() does > free those stale, untransmitted XDP_TX packets, but it is not coded up > to also reset the xdp_tx_in_flight counter, hence the manifestation of > the bug." > > And then we should have a paragraph that describes the solution as well. > "One option would be to cover this extra condition in enetc_free_tx_ring(), > but now that the ENETC_TX_DOWN exists, we have created a window at the > beginning of enetc_stop() where NAPI can still be scheduled, but any > concurrent enqueue will be blocked. Therefore, we can call enetc_wait_bdrs() > and enetc_disable_tx_bdrs() with NAPI still scheduled, and it is > guaranteed that this will not wait indefinitely, but instead give us an > indication that the pending TX frames have orderly dropped to zero. > Only then should we call napi_disable(). > > This way, enetc_free_tx_ring() becomes entirely redundant and can be > dropped as part of subsequent cleanup. > > The change also refactors enetc_start() so that it looks like the mirror > opposite procedure of enetc_stop()." > > I think describing the problem and solution in these terms gives the > reviewers more versed in NAPI a better chance of understanding what is > going on and what we are trying to achieve. Thanks for helping rephrase the commit message, I will applying it to the next version.