Re: AF_XDP sockets across multiple NIC queues

On Fri, 26 Mar 2021 at 00:36, Magnus Karlsson <magnus.karlsson@xxxxxxxxx> wrote:
>
> On Thu, Mar 25, 2021 at 7:51 PM Konstantinos Kaffes <kkaffes@xxxxxxxxx> wrote:
> >
> > Great, thanks for the info! I will look into implementing this.
> >
> > For the time being, I implemented a version of my design with N^2
> > sockets. I observed that when all traffic is directed to a single NIC
> > queue, the throughput is higher than when I use all N NIC queues. I am
> > using spinlocks to guard concurrent access to UMEM and the
> > fill/completion rings. When I use a single NIC queue, I achieve
> > ~1Mpps; when I use multiple, ~550Kpps. Are these numbers reasonable,
> > and is this bad scaling behavior expected?
>
> 1Mpps sounds reasonable with SKB mode. If you use something simple
> like the spinlock scheme you describe, then it will not scale. Check
> the sample xsk_fwd.c in samples/bpf in the Linux kernel repo. It has a
> mempool implementation that should scale better than the one you
> implemented. For anything remotely complicated, something that manages
> the buffers in the umem plus the fill and completion queues is usually
> required. This is called a mempool most of the time. User-space
> network libraries such as DPDK and VPP provide fast and scalable
> mempool implementations. It would be nice to add a simple one to
> libbpf, or rather to libxdp, as the AF_XDP functionality is moving over
> there. Several people have asked for it, but unfortunately I have not
> had the time.
>

Thanks for the tip! I have also started trying zero-copy DRV mode and
came across some odd behavior. When I use multiple sockets, one per
NIC queue, I observe very low throughput and a lot of time spent in
the following loop:

uint32_t idx_cq;
uint32_t ret = 0;

/* Busy-wait until buf_count completions appear on the completion ring. */
while (ret < buf_count) {
  ret += xsk_ring_cons__peek(&xsk->umem->cq, buf_count, &idx_cq);
}

This does not happen when I have only one XDP socket bound to a single queue.
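
For reference, my understanding of the usual completion-ring drain with
libbpf's xsk.h is roughly the following (simplified sketch; BATCH_SIZE
and recycle_frame() are placeholders, not my exact code):

/* Simplified sketch of the usual completion-ring drain with libbpf's
 * xsk.h helpers: peek a batch, read the completed frame addresses,
 * then release the entries back to the kernel. */
uint32_t idx_cq = 0;
uint32_t done = xsk_ring_cons__peek(&xsk->umem->cq, BATCH_SIZE, &idx_cq);

for (uint32_t i = 0; i < done; i++) {
  uint64_t addr = *xsk_ring_cons__comp_addr(&xsk->umem->cq, idx_cq + i);
  recycle_frame(addr);   /* placeholder: return the frame to my pool */
}

if (done > 0)
  xsk_ring_cons__release(&xsk->umem->cq, done);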

Any idea why this might be happening?
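
In the meantime I am reworking the buffer management along the lines
you suggest, with a small per-thread cache in front of the shared pool
so the spinlock is only taken on refill/flush. A rough, untested sketch
(all names are made up):

#include <pthread.h>
#include <stdint.h>

/* Shared pool of free umem frame addresses, one per umem. */
struct frame_pool {
  pthread_spinlock_t lock;
  uint64_t *addrs;        /* stack of free frame addresses */
  uint32_t n_addrs;
};

/* Per-thread cache; the lock is only touched when it runs empty/full. */
#define CACHE_SIZE 64
struct frame_cache {
  struct frame_pool *pool;
  uint64_t addrs[CACHE_SIZE];
  uint32_t n_addrs;
};

static uint64_t frame_alloc(struct frame_cache *c)
{
  if (c->n_addrs == 0) {
    /* Refill half the cache from the shared pool in one locked pass. */
    pthread_spin_lock(&c->pool->lock);
    while (c->n_addrs < CACHE_SIZE / 2 && c->pool->n_addrs > 0)
      c->addrs[c->n_addrs++] = c->pool->addrs[--c->pool->n_addrs];
    pthread_spin_unlock(&c->pool->lock);
    if (c->n_addrs == 0)
      return UINT64_MAX;  /* pool exhausted */
  }
  return c->addrs[--c->n_addrs];
}

static void frame_free(struct frame_cache *c, uint64_t addr)
{
  if (c->n_addrs == CACHE_SIZE) {
    /* Flush half the cache back to the shared pool in one locked pass. */
    pthread_spin_lock(&c->pool->lock);
    while (c->n_addrs > CACHE_SIZE / 2)
      c->pool->addrs[c->pool->n_addrs++] = c->addrs[--c->n_addrs];
    pthread_spin_unlock(&c->pool->lock);
  }
  c->addrs[c->n_addrs++] = addr;
}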

> >
> > On Thu, 25 Mar 2021 at 00:24, Magnus Karlsson <magnus.karlsson@xxxxxxxxx> wrote:
> > >
> > > On Thu, Mar 25, 2021 at 7:25 AM Konstantinos Kaffes <kkaffes@xxxxxxxxx> wrote:
> > > >
> > > > Hello everyone,
> > > >
> > > > I want to write a multi-threaded AF_XDP server where all N threads can
> > > > read from all N NIC queues. In my design, each thread creates N AF_XDP
> > > > sockets, each associated with a different queue. I have the following
> > > > questions:
> > > >
> > > > 1. Do sockets associated with the same queue need to share their UMEM
> > > > area and fill and completion rings?
> > >
> > > Yes. In zero-copy mode this is natural since the NIC HW will DMA the
> > > packet into a umem that was decided long before the packet was even
> > > received. And this is of course before we even get to pick what socket
> > > it should go to. This restriction is currently carried over to
> > > copy-mode; however, in theory there is nothing preventing support for
> > > multiple umems on the same netdev and queue id in copy-mode. It is
> > > just that nobody has implemented support for it.
> > >
> > > > 2. Will there be a single XSKMAP holding all N^2 sockets? If yes, what
> > > > happens if my XDP program redirects a packet to a socket that is
> > > > associated with a different NIC queue than the one in which the packet
> > > > arrived?
> > >
> > > You can have multiple XSKMAPs, but you would in any case need N^2
> > > sockets in total to cover all cases. Sockets are tied to a specific
> > > netdev and queue id. If you try to redirect to a socket
> > > with a queue id or netdev that the packet was not received on, it will
> > > be dropped. Again, for copy-mode, it would from a theoretical
> > > perspective be perfectly fine to redirect to another queue id and/or
> > > netdev since the packet is copied anyway. Maybe you want to add
> > > support for it :-).
> > >
> > > > I should mention that I am using XDP SKB mode with copies.
> > > >
> > > > Thank you in advance,
> > > > Kostis
> >
> >
> >
> > --
> > Kostis Kaffes
> > PhD Student in Electrical Engineering
> > Stanford University
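
One more follow-up on the shared-umem point above, mostly as a note to
myself: with libbpf this maps to xsk_socket__create_shared(), where each
socket gets its own RX/TX rings and a socket on a new queue gets its own
fill/completion pair. A minimal sketch of one socket per queue on a
single umem (error handling, ring sizing and fill-ring seeding omitted;
names are illustrative):

#include <bpf/xsk.h>
#include <stdint.h>

struct xsk_per_queue {
  struct xsk_socket *xsk;
  struct xsk_ring_cons rx;
  struct xsk_ring_prod tx;
  struct xsk_ring_prod fq;   /* fill ring for this queue */
  struct xsk_ring_cons cq;   /* completion ring for this queue */
};

/* Create one umem and bind one AF_XDP socket per NIC queue to it. */
static int setup_queues(void *umem_area, uint64_t umem_size,
                        const char *ifname, int num_queues,
                        struct xsk_per_queue *q)
{
  struct xsk_umem *umem;
  int i;

  /* The umem is created once; its default fill/completion rings are
   * the ones later reused by the socket on queue 0. */
  if (xsk_umem__create(&umem, umem_area, umem_size,
                       &q[0].fq, &q[0].cq, NULL))
    return -1;

  for (i = 0; i < num_queues; i++) {
    if (xsk_socket__create_shared(&q[i].xsk, ifname, i, umem,
                                  &q[i].rx, &q[i].tx,
                                  &q[i].fq, &q[i].cq, NULL))
      return -1;
  }
  return 0;
}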
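
And for the XSKMAP question, for the record, the usual pattern from the
kernel's xdpsock sample is a single XSKMAP indexed by the RX queue id,
so every packet goes to the socket bound to the queue it arrived on.
Rough sketch (the map size here is arbitrary):

#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

struct {
  __uint(type, BPF_MAP_TYPE_XSKMAP);
  __uint(max_entries, 64);           /* >= number of NIC queues */
  __uint(key_size, sizeof(__u32));
  __uint(value_size, sizeof(__u32));
} xsks_map SEC(".maps");

SEC("xdp")
int xdp_redirect_xsk(struct xdp_md *ctx)
{
  /* Redirect to the socket bound to the queue this packet arrived on;
   * redirecting to a socket on another queue would get it dropped.
   * XDP_PASS is the fallback if no socket is bound to this queue. */
  return bpf_redirect_map(&xsks_map, ctx->rx_queue_index, XDP_PASS);
}

char _license[] SEC("license") = "GPL";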



-- 
Kostis Kaffes
PhD Student in Electrical Engineering
Stanford University


