On Fri, 26 Mar 2021 at 00:36, Magnus Karlsson <magnus.karlsson@xxxxxxxxx> wrote:
>
> On Thu, Mar 25, 2021 at 7:51 PM Konstantinos Kaffes <kkaffes@xxxxxxxxx> wrote:
> >
> > Great, thanks for the info! I will look into implementing this.
> >
> > For the time being, I implemented a version of my design with N^2
> > sockets. I observed that when all traffic is directed to a single NIC
> > queue, the throughput is higher than when I use all N NIC queues. I am
> > using spinlocks to guard concurrent access to the UMEM and the
> > fill/completion rings. When I use a single NIC queue, I achieve
> > ~1 Mpps; when I use multiple queues, ~550 Kpps. Are these numbers
> > reasonable, and is this poor scaling behavior expected?
>
> 1 Mpps sounds reasonable with SKB mode. If you use something simple
> like the spinlock scheme you describe, it will not scale. Check the
> sample xsk_fwd.c in samples/bpf in the Linux kernel repo; it has a
> mempool implementation that should scale better than the one you
> implemented. For anything remotely complicated, something that manages
> the buffers in the umem plus the fill and completion queues is usually
> required. This is usually called a mempool. User-space network
> libraries such as DPDK and VPP provide fast and scalable mempool
> implementations. It would be nice to add a simple one to libbpf, or
> rather to libxdp, since the AF_XDP functionality is moving over there.
> Several people have asked for it, but unfortunately I have not had
> the time.
>

Thanks for the tip!

I have also started trying out zero-copy DRV mode and came across some
weird behavior. When I use multiple sockets, one per NIC queue, I observe
very low throughput and a lot of time spent in the following loop:

	uint32_t idx_cq;
	while (ret < buf_count) {
		ret += xsk_ring_cons__peek(&xsk->umem->cq, buf_count, &idx_cq);
	}

This does not happen when I have only one XDP socket bound to a single
queue. Any idea why this might be happening?
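For a bit more context, my TX/completion handling looks roughly like the
sketch below. This is a simplified sketch rather than my exact code: the
xsk_socket_info/xsk_umem_info wrappers are modelled on
samples/bpf/xdpsock_user.c, wait_for_completions is just an illustrative
name, and the buffer-pool bookkeeping is elided.

	#include <stdint.h>
	#include <sys/socket.h>
	#include <bpf/xsk.h>	/* libbpf AF_XDP helpers */

	/* Wrapper structs in the style of the xdpsock sample; each socket
	 * points at the umem (and its completion ring) it shares with the
	 * other sockets bound to the same queue. */
	struct xsk_umem_info {
		struct xsk_ring_prod fq;
		struct xsk_ring_cons cq;
		struct xsk_umem *umem;
		void *buffer;
	};

	struct xsk_socket_info {
		struct xsk_ring_cons rx;
		struct xsk_ring_prod tx;
		struct xsk_umem_info *umem;
		struct xsk_socket *xsk;
	};

	/* Reap buf_count TX completions from the completion ring. */
	static void wait_for_completions(struct xsk_socket_info *xsk,
					 uint32_t buf_count)
	{
		uint32_t completed = 0;

		/* In zero-copy mode the kernel may need a kick before it
		 * processes the TX ring and posts completions. */
		if (xsk_ring_prod__needs_wakeup(&xsk->tx))
			sendto(xsk_socket__fd(xsk->xsk), NULL, 0,
			       MSG_DONTWAIT, NULL, 0);

		/* This is the loop that spins when I use multiple queues. */
		while (completed < buf_count) {
			uint32_t idx_cq;
			uint32_t ret = xsk_ring_cons__peek(&xsk->umem->cq,
							   buf_count - completed,
							   &idx_cq);

			for (uint32_t i = 0; i < ret; i++) {
				/* Frame address now free to be reused. */
				uint64_t addr =
					*xsk_ring_cons__comp_addr(&xsk->umem->cq,
								  idx_cq + i);
				(void)addr; /* returned to the pool in my real code */
			}

			xsk_ring_cons__release(&xsk->umem->cq, ret);
			completed += ret;
		}
	}

The sendto() kick is the part I am least sure about; my understanding is
that without it the driver may never get around to processing the TX ring
in zero-copy mode, so no completions would ever show up.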
> >
> > On Thu, 25 Mar 2021 at 00:24, Magnus Karlsson <magnus.karlsson@xxxxxxxxx> wrote:
> > >
> > > On Thu, Mar 25, 2021 at 7:25 AM Konstantinos Kaffes <kkaffes@xxxxxxxxx> wrote:
> > > >
> > > > Hello everyone,
> > > >
> > > > I want to write a multi-threaded AF_XDP server where all N threads can
> > > > read from all N NIC queues. In my design, each thread creates N AF_XDP
> > > > sockets, each associated with a different queue. I have the following
> > > > questions:
> > > >
> > > > 1. Do sockets associated with the same queue need to share their UMEM
> > > > area and fill and completion rings?
> > >
> > > Yes. In zero-copy mode this is natural, since the NIC HW will DMA the
> > > packet into a umem that was decided long before the packet was even
> > > received. And this is of course before we even get to pick which
> > > socket it should go to. This restriction is currently carried over to
> > > copy-mode; however, in theory there is nothing preventing support for
> > > multiple umems on the same netdev and queue id in copy-mode. It is
> > > just that nobody has implemented it.
> > >
> > > > 2. Will there be a single XSKMAP holding all N^2 sockets? If yes, what
> > > > happens if my XDP program redirects a packet to a socket that is
> > > > associated with a different NIC queue than the one in which the packet
> > > > arrived?
> > >
> > > You can have multiple XSKMAPs, but you would in any case need N^2
> > > sockets in total to cover all cases. Sockets are tied to a specific
> > > netdev and queue id. If you try to redirect to a socket with a queue
> > > id or netdev that the packet was not received on, it will be dropped.
> > > Again, for copy-mode it would from a theoretical perspective be
> > > perfectly fine to redirect to another queue id and/or netdev, since
> > > the packet is copied anyway. Maybe you want to add support for it :-).
> > >
> > > > I must mention that I am using XDP skb mode with copies.
> > > >
> > > > Thank you in advance,
> > > > Kostis
> >
> >
> > --
> > Kostis Kaffes
> > PhD Student in Electrical Engineering
> > Stanford University

--
Kostis Kaffes
PhD Student in Electrical Engineering
Stanford University