On Thu, Aug 22, 2019 at 9:58 AM Ilya Maximets <i.maximets@xxxxxxxxxxx> wrote: > > On 22.08.2019 19:38, Alexander Duyck wrote: > > On Thu, Aug 22, 2019 at 5:30 AM Ilya Maximets <i.maximets@xxxxxxxxxxx> wrote: > >> > >> Tx code doesn't clear the descriptors' status after cleaning. > >> So, if the budget is larger than number of used elems in a ring, some > >> descriptors will be accounted twice and xsk_umem_complete_tx will move > >> prod_tail far beyond the prod_head breaking the comletion queue ring. > >> > >> Fix that by limiting the number of descriptors to clean by the number > >> of used descriptors in the tx ring. > >> > >> 'ixgbe_clean_xdp_tx_irq()' function refactored to look more like > >> 'ixgbe_xsk_clean_tx_ring()' since we don't need most of the > >> complications implemented in the regular 'ixgbe_clean_tx_irq()' > >> and we're allowed to directly use 'next_to_clean' and 'next_to_use' > >> indexes. > >> > >> Fixes: 8221c5eba8c1 ("ixgbe: add AF_XDP zero-copy Tx support") > >> Signed-off-by: Ilya Maximets <i.maximets@xxxxxxxxxxx> > >> --- > >> > >> Version 2: > >> * 'ixgbe_clean_xdp_tx_irq()' refactored to look more like > >> 'ixgbe_xsk_clean_tx_ring()'. > >> > >> drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c | 34 ++++++++------------ > >> 1 file changed, 13 insertions(+), 21 deletions(-) > >> > >> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c > >> index 6b609553329f..d1297660e14a 100644 > >> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c > >> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c > >> @@ -633,22 +633,23 @@ static void ixgbe_clean_xdp_tx_buffer(struct ixgbe_ring *tx_ring, > >> bool ixgbe_clean_xdp_tx_irq(struct ixgbe_q_vector *q_vector, > >> struct ixgbe_ring *tx_ring, int napi_budget) > >> { > >> + u16 ntc = tx_ring->next_to_clean, ntu = tx_ring->next_to_use; > >> unsigned int total_packets = 0, total_bytes = 0; > >> - u32 i = tx_ring->next_to_clean, xsk_frames = 0; > >> unsigned int budget = q_vector->tx.work_limit; > >> struct xdp_umem *umem = tx_ring->xsk_umem; > >> - union ixgbe_adv_tx_desc *tx_desc; > >> - struct ixgbe_tx_buffer *tx_bi; > >> + u32 xsk_frames = 0; > >> bool xmit_done; > >> > >> - tx_bi = &tx_ring->tx_buffer_info[i]; > >> - tx_desc = IXGBE_TX_DESC(tx_ring, i); > >> - i -= tx_ring->count; > >> + while (likely(ntc != ntu && budget)) { > > > > I would say you can get rid of budget entirely. It was only really > > needed for the regular Tx case where you can have multiple CPUs > > feeding a single Tx queue and causing a stall. Since we have a 1:1 > > mapping we should never have more than the Rx budget worth of packets > > to really process. In addition we can only make one pass through the > > ring since the ntu value is not updated while running the loop. > > OK. Will remove. > > > > >> + union ixgbe_adv_tx_desc *tx_desc; > >> + struct ixgbe_tx_buffer *tx_bi; > >> + > >> + tx_desc = IXGBE_TX_DESC(tx_ring, ntc); > >> > >> - do { > >> if (!(tx_desc->wb.status & cpu_to_le32(IXGBE_TXD_STAT_DD))) > >> break; > >> > >> + tx_bi = &tx_ring->tx_buffer_info[ntc]; > > > > Please don't move this logic into the loop. We were intentionally > > processing this outside of the loop once and then just doing the > > increments because it is faster that way. It takes several operations > > to compute tx_bi based on ntc, whereas just incrementing is a single > > operation. > > OK. > > > > >> total_bytes += tx_bi->bytecount; > >> total_packets += tx_bi->gso_segs; > >> > >> @@ -659,24 +660,15 @@ bool ixgbe_clean_xdp_tx_irq(struct ixgbe_q_vector *q_vector, > >> > >> tx_bi->xdpf = NULL; > >> > >> - tx_bi++; > >> - tx_desc++; > >> - i++; > >> - if (unlikely(!i)) { > >> - i -= tx_ring->count; > > > > So these two lines can probably just be replaced by: > > if (unlikely(ntc == tx_ring->count)) { > > ntc = 0; > > Sure. > > > > >> - tx_bi = tx_ring->tx_buffer_info; > >> - tx_desc = IXGBE_TX_DESC(tx_ring, 0); > >> - } > >> - > >> - /* issue prefetch for next Tx descriptor */ > >> - prefetch(tx_desc); > > > > Did you just drop the prefetch? > > I'll keep the prefetch in v3 because, as you fairly mentioned, it's not > related to this patch. However, I'm not sure if this prefetch makes any > sense here, because there is only one comparison operation between the > prefetch and the data usage: > > while (ntc != ntu) { > if (!(tx_desc->wb.status ... > <...> > prefetch(tx_desc); > } I'm not opposed to dropping the prefetch, but if you are going to do it you should do it in a separate patch.