On Thu, Oct 10, 2019 at 11:30 AM Július Milan <Julius.Milan@xxxxxxxxxxxxx> wrote:
>
> On 30 Sep 2019, at 13:02, Magnus Karlsson wrote:
>
> > On Mon, Sep 30, 2019 at 11:28 AM Eelco Chaudron <echaudro@xxxxxxxxxx> wrote:
> >>
> >> On 30 Sep 2019, at 8:51, Magnus Karlsson wrote:
> >>
> >>> On Fri, Sep 27, 2019 at 8:09 PM William Tu <u9012063@xxxxxxxxx> wrote:
> >>>>
> >>>> On Fri, Sep 27, 2019 at 12:02 AM Magnus Karlsson
> >>>> <magnus.karlsson@xxxxxxxxx> wrote:
> >>>>>
> >>>>> On Thu, Sep 26, 2019 at 1:34 AM William Tu <u9012063@xxxxxxxxx> wrote:
> >>>>>>
> >>>>>> On Wed, Sep 25, 2019 at 12:48 AM Eelco Chaudron
> >>>>>> <echaudro@xxxxxxxxxx> wrote:
> >>>>>>>
> >>>>>>> On 25 Sep 2019, at 8:46, Július Milan wrote:
> >>>>>>>
> >>>>>>>> Hi Eelco
> >>>>>>>>
> >>>>>>>>> Currently, OVS uses the mmapped memory directly; however, on
> >>>>>>>>> egress it copies the memory into the egress interface's
> >>>>>>>>> mmapped memory.
> >>>>>>>>
> >>>>>>>> Great, thanks for making this clear to me.
> >>>>>>>>
> >>>>>>>>> Currently, OVS uses an AF_XDP memory pool per interface, so a
> >>>>>>>>> further optimization could be to use a global memory pool so
> >>>>>>>>> this extra copy is not needed.
> >>>>>>>>
> >>>>>>>> Is it even possible to make this further optimization? Since
> >>>>>>>> every interface has its own non-shared umem, from my point of
> >>>>>>>> view at least one copy is necessary in the case you described
> >>>>>>>> above (when the RX interface is different from the TX
> >>>>>>>> interface). Or am I missing something?
> >>>>>>>
> >>>>>>> Someone @Intel told me it would be possible to have one huge
> >>>>>>> mempool that can be shared between interfaces. However, I have
> >>>>>>> not researched/tried it.
> >>>>>>
> >>>>>> I thought about it before, but the problem is that the cq and fq
> >>>>>> are per-umem. So when having only one umem shared by many queues
> >>>>>> or devices, each one has to acquire a lock before it can access
> >>>>>> the cq or fq. I think that might become much slower.
> >>>>>
> >>>>> You basically have to implement a mempool that can be used by
> >>>>> multiple processes. Unfortunately, there is no lean and mean
> >>>>> standalone implementation of a mempool. There is a good one in
> >>>>> DPDK, but then you get the whole DPDK package into your
> >>>>> application, which is likely what you wanted to avoid in the
> >>>>> first place. Anyone for writing libmempool?
> >>>>>
> >>>>> /Magnus
> >>>>
> >>>> That's interesting.
> >>>> Do you mean DPDK's rte_mempool, which supports multiple producers?
> >>>
> >>> Yes.
> >>>
> >>>> If I create a shared umem for queue1 and queue2, then each queue
> >>>> has its own tx/rx ring so they can process in parallel. But for
> >>>> handling the per-umem cq/fq, I can create a dedicated thread to
> >>>> process the cq/fq.
> >>>> So for example:
> >>>> Thread 1 for handling the cq/fq
> >>>> Thread 2 for processing queue1's tx/rx rings
> >>>> Thread 3 for processing queue2's tx/rx rings
> >>>> and the mempool should allow multiple producers and consumers.
> >>>>
> >>>> Does this sound correct?
> >>>
> >>> You do not need a dedicated process. Just something in the mempool
> >>> code that enforces mutual exclusion (a mutex or whatever) between
> >>> threads 2 and 3 when they are performing operations on the mempool.
> >>> Going with a dedicated process sounds complicated.
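[As a rough illustration of the mutual-exclusion scheme discussed above,
a minimal mutex-protected frame pool could look like the sketch below.
All names (frame_pool, frame_pool_get, frame_pool_put) are invented for
illustration; this is not libbpf or DPDK API.]

/*
 * Minimal sketch of a frame pool shared by several threads, each
 * driving one xsk socket.  A single lock serializes all pool
 * operations, which is exactly the contention concern raised above.
 */
#include <pthread.h>
#include <stdint.h>

struct frame_pool {
        pthread_mutex_t lock;   /* serializes all pool operations     */
        uint64_t *frames;       /* stack of free umem frame addresses */
        uint32_t free_count;    /* number of entries currently free   */
        uint32_t size;          /* capacity of the frames array       */
};

static int frame_pool_init(struct frame_pool *fp, uint64_t *frames,
                           uint32_t nframes)
{
        fp->frames = frames;
        fp->free_count = nframes;
        fp->size = nframes;
        return pthread_mutex_init(&fp->lock, NULL);
}

/* Take up to n free frame addresses, e.g. to post to a fill ring. */
static uint32_t frame_pool_get(struct frame_pool *fp, uint64_t *out,
                               uint32_t n)
{
        uint32_t i;

        pthread_mutex_lock(&fp->lock);
        for (i = 0; i < n && fp->free_count > 0; i++)
                out[i] = fp->frames[--fp->free_count];
        pthread_mutex_unlock(&fp->lock);

        return i;
}

/* Return frame addresses, e.g. after draining a completion ring. */
static void frame_pool_put(struct frame_pool *fp, const uint64_t *in,
                           uint32_t n)
{
        uint32_t i;

        pthread_mutex_lock(&fp->lock);
        for (i = 0; i < n && fp->free_count < fp->size; i++)
                fp->frames[fp->free_count++] = in[i];
        pthread_mutex_unlock(&fp->lock);
}

[Threads 2 and 3 would call frame_pool_get() when they need buffers and
frame_pool_put() once packets have been transmitted or consumed; the
single lock keeps the pool consistent but is also what could make the
shared-umem path slower under contention.]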
> >> I was trying to see how to experiment with this using libbpf, but
> >> it looks like it's not yet supported?
> >>
> >> I see the following in xsk_socket__create():
> >>
> >>         if (umem->refcount) {
> >>                 pr_warning("Error: shared umems not supported by libbpf.\n");
> >>                 return -EBUSY;
> >>         }
> >
> > Using the XDP_SHARED_UMEM option is not supported in libbpf at this
> > point in time. In this mode you share a single umem with a single
> > completion queue and a single fill queue among many xsk sockets tied
> > to the same queue id. But note that you can register the same umem
> > area multiple times (creating multiple umem handles and multiple fqs
> > and cqs) to be able to support xsk sockets that have different queue
> > ids but the same umem area. In both cases you need a mempool that can
> > handle multiple threads.
>
> Thinking about a libmempool with a umem shared among various independent
> processes, that would be great. Then multiple processes could share the
> same NIC, or even the same queue, if registered with all the necessary
> locking in libmempool.
> But what if one process crashes? I am wondering how to achieve proper
> cleanup, and whether it is even possible with the architecture I
> mentioned. Maybe with some monitoring thread, but that's complicated.
> Any ideas?

That is a correct observation. Dealing with failures when processes share
memory is hard. It is much easier with a private memory model, but that
usually has negative performance implications. For some inspiration, check
"man pthread_mutexattr_getrobust" and how robust mutexes can be used when
a process holding the mutex dies. That makes the problem a little bit more
tractable.

> > The old xdpsock application, prior to libbpf, had support for the
> > XDP_SHARED_UMEM option. Take a look at that one if you would like to
> > experiment with it.
> >
> > /Magnus
>
> Július
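[For reference, a rough sketch of the robust-mutex approach mentioned
above might look as follows. Only the pthread calls come from the man
page; the shm name "/xsk_pool_hdr" and struct pool_hdr are invented for
illustration.]

/*
 * A process-shared, robust mutex placed in shared memory so that a
 * crashed process does not leave the frame pool locked forever.
 */
#include <errno.h>
#include <fcntl.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

struct pool_hdr {
        pthread_mutex_t lock;
        /* free-frame bookkeeping for the shared umem would follow here */
};

/* Map the shared header; the creating process must call
 * pool_lock_init() once before any other process takes the lock. */
static struct pool_hdr *pool_hdr_map(int *created)
{
        void *mem;
        int fd;

        fd = shm_open("/xsk_pool_hdr", O_RDWR | O_CREAT | O_EXCL, 0600);
        *created = fd >= 0;
        if (fd < 0)
                fd = shm_open("/xsk_pool_hdr", O_RDWR, 0600);
        if (fd < 0)
                return NULL;
        if (*created && ftruncate(fd, sizeof(struct pool_hdr)) < 0)
                return NULL;

        mem = mmap(NULL, sizeof(struct pool_hdr), PROT_READ | PROT_WRITE,
                   MAP_SHARED, fd, 0);
        return mem == MAP_FAILED ? NULL : mem;
}

static void pool_lock_init(struct pool_hdr *hdr)
{
        pthread_mutexattr_t attr;

        pthread_mutexattr_init(&attr);
        pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
        pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST);
        pthread_mutex_init(&hdr->lock, &attr);
        pthread_mutexattr_destroy(&attr);
}

/* Lock the pool; recover if the previous owner died while holding it. */
static int pool_lock(struct pool_hdr *hdr)
{
        int err = pthread_mutex_lock(&hdr->lock);

        if (err == EOWNERDEAD) {
                /* The dead process may have left the bookkeeping
                 * half-updated; repair it here (e.g. reclaim the frames
                 * that process had taken), then mark the mutex usable. */
                pthread_mutex_consistent(&hdr->lock);
                err = 0;
        }
        return err;
}

[A process that gets EOWNERDEAD back knows the previous owner died while
holding the lock and can reclaim that owner's outstanding frames before
continuing, which is roughly the cleanup problem raised in the question
above.]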