From: Magnus Karlsson <magnus.karlsson@xxxxxxxxx>

Rollback the reservation in the completion ring when we get a
NETDEV_TX_BUSY. When this error is received from the driver, we are
supposed to let the user application retry the transmit again. And in
order to do this, we need to roll back the failed send so it can be
retried. Unfortunately, we did not cancel the reservation we had made
in the completion ring. By not doing this, we actually make the
completion ring one entry smaller per NETDEV_TX_BUSY error we get, and
after enough of these errors the completion ring will be of size zero
and transmit will stop working.

Fix this by cancelling the reservation when we get a NETDEV_TX_BUSY
error.

Fixes: 642e450b6b59 ("xsk: Do not discard packet when NETDEV_TX_BUSY")
Signed-off-by: Magnus Karlsson <magnus.karlsson@xxxxxxxxx>
Reported-by: Xuan Zhuo <xuanzhuo@xxxxxxxxxxxxxxxxx>
---
 net/xdp/xsk.c       | 3 +++
 net/xdp/xsk_queue.h | 5 +++++
 2 files changed, 8 insertions(+)

diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index 42cb5f94d49e..2587583a6be9 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -428,6 +428,9 @@ static int xsk_generic_xmit(struct sock *sk)
 		if (err == NETDEV_TX_BUSY) {
 			/* Tell user-space to retry the send */
 			skb->destructor = sock_wfree;
+			spin_lock_irqsave(&xs->pool->cq_lock, flags);
+			xskq_prod_cancel(xs->pool->cq);
+			spin_unlock_irqrestore(&xs->pool->cq_lock, flags);
 			/* Free skb without triggering the perf drop trace */
 			consume_skb(skb);
 			err = -EAGAIN;
diff --git a/net/xdp/xsk_queue.h b/net/xdp/xsk_queue.h
index 9e71b9f27679..ef6de0fb4e31 100644
--- a/net/xdp/xsk_queue.h
+++ b/net/xdp/xsk_queue.h
@@ -286,6 +286,11 @@ static inline bool xskq_prod_is_full(struct xsk_queue *q)
 	return !free_entries;
 }
 
+static inline void xskq_prod_cancel(struct xsk_queue *q)
+{
+	q->cached_prod--;
+}
+
 static inline int xskq_prod_reserve(struct xsk_queue *q)
 {
 	if (xskq_prod_is_full(q))
-- 
2.29.0