On Thu, Dec 10, 2020 at 7:36 AM Magnus Karlsson <magnus.karlsson@xxxxxxxxx> wrote:
>
> From: Magnus Karlsson <magnus.karlsson@xxxxxxxxx>
>
> Fix a race when multiple sockets are simultaneously calling sendto()
> when the completion ring is shared in the SKB case. This is the case
> when you share the same netdev and queue id through the
> XDP_SHARED_UMEM bind flag. The problem is that multiple processes can
> be in xsk_generic_xmit() and call the backpressure mechanism in
> xskq_prod_reserve(xs->pool->cq). As this is a shared resource in this
> specific scenario, a race might occur since the rings are
> single-producer single-consumer.
>
> Fix this by moving the tx_completion_lock from the socket to the pool
> as the pool is shared between the sockets that share the completion
> ring. (The pool is not shared when this is not the case.) And then
> protect the accesses to xskq_prod_reserve() with this lock. The
> tx_completion_lock is renamed cq_lock to better reflect that it
> protects accesses to the potentially shared completion ring.
>
> Fixes: 35fcde7f8deb ("xsk: support for Tx")
> Fixes: a9744f7ca200 ("xsk: fix potential race in SKB TX completion code")
> Signed-off-by: Magnus Karlsson <magnus.karlsson@xxxxxxxxx>
> Reported-by: Xuan Zhuo <xuanzhuo@xxxxxxxxxxxxxxxxx>
> ---
>  include/net/xdp_sock.h      | 4 ----
>  include/net/xsk_buff_pool.h | 5 +++++
>  net/xdp/xsk.c               | 9 ++++++---
>  net/xdp/xsk_buff_pool.c     | 1 +
>  4 files changed, 12 insertions(+), 7 deletions(-)
>
> diff --git a/include/net/xdp_sock.h b/include/net/xdp_sock.h
> index 4f4e93bf814c..cc17bc957548 100644
> --- a/include/net/xdp_sock.h
> +++ b/include/net/xdp_sock.h
> @@ -58,10 +58,6 @@ struct xdp_sock {
>
>  	struct xsk_queue *tx ____cacheline_aligned_in_smp;
>  	struct list_head tx_list;
> -	/* Mutual exclusion of NAPI TX thread and sendmsg error paths
> -	 * in the SKB destructor callback.
> -	 */
> -	spinlock_t tx_completion_lock;
>  	/* Protects generic receive. */
>  	spinlock_t rx_lock;
>
> diff --git a/include/net/xsk_buff_pool.h b/include/net/xsk_buff_pool.h
> index 01755b838c74..eaa8386dbc63 100644
> --- a/include/net/xsk_buff_pool.h
> +++ b/include/net/xsk_buff_pool.h
> @@ -73,6 +73,11 @@ struct xsk_buff_pool {
>  	bool dma_need_sync;
>  	bool unaligned;
>  	void *addrs;
> +	/* Mutual exclusion of the completion ring in the SKB mode. Two cases to protect:
> +	 * NAPI TX thread and sendmsg error paths in the SKB destructor callback and when
> +	 * sockets share a single cq when the same netdev and queue id is shared.
> +	 */
> +	spinlock_t cq_lock;
>  	struct xdp_buff_xsk *free_heads[];
>  };
>
> diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
> index 62504471fd20..42cb5f94d49e 100644
> --- a/net/xdp/xsk.c
> +++ b/net/xdp/xsk.c
> @@ -364,9 +364,9 @@ static void xsk_destruct_skb(struct sk_buff *skb)
>  	struct xdp_sock *xs = xdp_sk(skb->sk);
>  	unsigned long flags;
>
> -	spin_lock_irqsave(&xs->tx_completion_lock, flags);
> +	spin_lock_irqsave(&xs->pool->cq_lock, flags);
>  	xskq_prod_submit_addr(xs->pool->cq, addr);
> -	spin_unlock_irqrestore(&xs->tx_completion_lock, flags);
> +	spin_unlock_irqrestore(&xs->pool->cq_lock, flags);
>
>  	sock_wfree(skb);
>  }
> @@ -378,6 +378,7 @@ static int xsk_generic_xmit(struct sock *sk)
>  	bool sent_frame = false;
>  	struct xdp_desc desc;
>  	struct sk_buff *skb;
> +	unsigned long flags;
>  	int err = 0;
>
>  	mutex_lock(&xs->mutex);
>
> @@ -409,10 +410,13 @@ static int xsk_generic_xmit(struct sock *sk)
>  		 * if there is space in it. This avoids having to implement
>  		 * any buffering in the Tx path.
>  		 */
> +		spin_lock_irqsave(&xs->pool->cq_lock, flags);
>  		if (unlikely(err) || xskq_prod_reserve(xs->pool->cq)) {
> +			spin_unlock_irqrestore(&xs->pool->cq_lock, flags);
>  			kfree_skb(skb);
>  			goto out;
>  		}
> +		spin_unlock_irqrestore(&xs->pool->cq_lock, flags);

Lock/unlock for every packet? Do you have any performance concerns?
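
To make the shared-cq scenario from the commit message concrete, below is a minimal
user-space sketch, not the actual xsk code: the names fake_cq, cq_reserve() and
producer() are made up for illustration. Two threads reserving on one single-producer
ring can both pass the space check unless the reservation step is serialized, which is
what taking pool->cq_lock around xskq_prod_reserve() does in the patch.

#include <pthread.h>
#include <stdint.h>
#include <stdio.h>

#define RING_SIZE  8
#define ITERATIONS 1000000

struct fake_cq {
	uint32_t cached_prod;		/* producer index; an SPSC ring assumes one writer */
	uint32_t cached_cons;		/* consumer index; never advanced in this sketch */
	pthread_spinlock_t lock;	/* plays the role of pool->cq_lock */
};

/* Reserve one completion slot; returns 0 on success, -1 if the ring is full. */
static int cq_reserve(struct fake_cq *q, int use_lock)
{
	int ret = -1;

	if (use_lock)
		pthread_spin_lock(&q->lock);

	if (q->cached_prod - q->cached_cons < RING_SIZE) {
		/* Without the lock, two threads can both pass the check above
		 * and over-reserve, breaking the single-producer assumption.
		 */
		q->cached_prod++;
		ret = 0;
	}

	if (use_lock)
		pthread_spin_unlock(&q->lock);
	return ret;
}

/* Each thread models one socket in xsk_generic_xmit() reserving on the shared cq. */
static void *producer(void *arg)
{
	struct fake_cq *q = arg;

	for (int i = 0; i < ITERATIONS; i++)
		cq_reserve(q, 1);	/* change 1 to 0 to drop the serialization */
	return NULL;
}

int main(void)
{
	struct fake_cq q = { 0 };
	pthread_t t1, t2;

	pthread_spin_init(&q.lock, PTHREAD_PROCESS_PRIVATE);
	pthread_create(&t1, NULL, producer, &q);
	pthread_create(&t2, NULL, producer, &q);
	pthread_join(t1, NULL);
	pthread_join(t2, NULL);

	/* With the lock, at most RING_SIZE entries are ever outstanding. */
	printf("outstanding reservations: %u\n", q.cached_prod - q.cached_cons);
	return 0;
}

Build with something like gcc -O2 -pthread sketch.c. With use_lock flipped to 0 the
ring invariant can be violated, although a single run will not always hit the window.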