On Mon, Sep 30, 2019 at 5:17 AM Eelco Chaudron <echaudro@xxxxxxxxxx> wrote:
>
> On 30 Sep 2019, at 13:02, Magnus Karlsson wrote:
>
> > On Mon, Sep 30, 2019 at 11:28 AM Eelco Chaudron <echaudro@xxxxxxxxxx> wrote:
> >>
> >> On 30 Sep 2019, at 8:51, Magnus Karlsson wrote:
> >>
> >>> On Fri, Sep 27, 2019 at 8:09 PM William Tu <u9012063@xxxxxxxxx> wrote:
> >>>>
> >>>> On Fri, Sep 27, 2019 at 12:02 AM Magnus Karlsson <magnus.karlsson@xxxxxxxxx> wrote:
> >>>>>
> >>>>> On Thu, Sep 26, 2019 at 1:34 AM William Tu <u9012063@xxxxxxxxx> wrote:
> >>>>>>
> >>>>>> On Wed, Sep 25, 2019 at 12:48 AM Eelco Chaudron <echaudro@xxxxxxxxxx> wrote:
> >>>>>>>
> >>>>>>> On 25 Sep 2019, at 8:46, Július Milan wrote:
> >>>>>>>
> >>>>>>>> Hi Eelco
> >>>>>>>>
> >>>>>>>>> Currently, OVS uses the mmapped memory directly; however, on egress
> >>>>>>>>> it copies the memory into the egress interface's mmapped memory.
> >>>>>>>>
> >>>>>>>> Great, thanks for making this clear to me.
> >>>>>>>>
> >>>>>>>>> Currently, OVS uses an AF_XDP memory pool per interface, so a further
> >>>>>>>>> optimization could be to use a global memory pool so this extra copy
> >>>>>>>>> is not needed.
> >>>>>>>>
> >>>>>>>> Is it even possible to make this further optimization? Since every
> >>>>>>>> interface has its own non-shared umem, from my point of view at least
> >>>>>>>> one copy is necessary for the case you described above (when the RX
> >>>>>>>> interface is different from the TX interface). Or am I missing something?
> >>>>>>>
> >>>>>>> Someone @Intel told me it would be possible to have one huge mempool
> >>>>>>> that can be shared between interfaces. However, I have not
> >>>>>>> researched/tried it.
> >>>>>>
> >>>>>> I thought about it before, but the problem is that the cq and fq are
> >>>>>> per-umem. So when only one umem is shared among many queues or devices,
> >>>>>> each one has to acquire a lock before it can access the cq or fq. I
> >>>>>> think that might become much slower.
> >>>>>
> >>>>> You basically have to implement a mempool that can be used by multiple
> >>>>> processes. Unfortunately, there is no lean and mean standalone
> >>>>> implementation of a mempool. There is a good one in DPDK, but then you
> >>>>> get the whole DPDK package into your application, which is likely what
> >>>>> you wanted to avoid in the first place. Anyone for writing libmempool?
> >>>>>
> >>>>> /Magnus
> >>>>
> >>>> That's interesting.
> >>>> Do you mean DPDK's rte_mempool, which supports multiple producers?
> >>>
> >>> Yes.
> >>>
> >>>> If I create a shared umem for queue1 and queue2, then each queue has its
> >>>> own tx/rx rings so they can process in parallel. But for handling the
> >>>> per-umem cq/fq, I can create a dedicated thread to process the cq/fq.
> >>>> So for example:
> >>>> Thread 1 for handling the cq/fq
> >>>> Thread 2 for processing queue1's tx/rx rings
> >>>> Thread 3 for processing queue2's tx/rx rings
> >>>> and the mempool should allow multiple producers and consumers.
> >>>>
> >>>> Does this sound correct?
> >>>
> >>> You do not need a dedicated process. Just something in the mempool
> >>> code that enforces mutual exclusion (a mutex or whatever) between
> >>> threads 2 and 3 when they are performing operations on the mempool.
> >>> Going with a dedicated process sounds complicated.
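[As an illustration of what Magnus describes above (mutual exclusion inside the mempool rather than a dedicated thread), a minimal sketch in C might look like the following. All names here (umem_pool, upool_init, upool_get, upool_put) are hypothetical and not part of libbpf, DPDK, or OVS; frame counts and error handling are simplified.]

/* Sketch of a shared umem frame pool protected by a mutex, so that the
 * TX/RX threads for queue1 and queue2 can both allocate and free frames
 * from the same umem area. Hypothetical helper, not library code. */
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

struct umem_pool {
    pthread_mutex_t lock;   /* serializes, e.g., thread 2 and thread 3 */
    uint64_t *free_addrs;   /* stack of free frame addresses in the umem */
    uint32_t free_cnt;      /* number of entries currently on the stack */
};

static int upool_init(struct umem_pool *p, uint32_t num_frames,
                      uint32_t frame_size)
{
    p->free_addrs = malloc(num_frames * sizeof(*p->free_addrs));
    if (!p->free_addrs)
        return -1;
    /* Initially every frame in the umem area is free. */
    for (uint32_t i = 0; i < num_frames; i++)
        p->free_addrs[i] = (uint64_t)i * frame_size;
    p->free_cnt = num_frames;
    return pthread_mutex_init(&p->lock, NULL);
}

/* Called by any TX/RX thread that needs a frame; UINT64_MAX means empty. */
static uint64_t upool_get(struct umem_pool *p)
{
    uint64_t addr = UINT64_MAX;

    pthread_mutex_lock(&p->lock);
    if (p->free_cnt)
        addr = p->free_addrs[--p->free_cnt];
    pthread_mutex_unlock(&p->lock);
    return addr;
}

/* Called when a frame comes back from the completion or fill path. */
static void upool_put(struct umem_pool *p, uint64_t addr)
{
    pthread_mutex_lock(&p->lock);
    p->free_addrs[p->free_cnt++] = addr;
    pthread_mutex_unlock(&p->lock);
}

[If the single lock becomes a bottleneck, a per-thread cache of frames in front of the shared stack (similar in spirit to DPDK's per-lcore mempool cache) is the usual way to reduce contention; that is outside the scope of this sketch.]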
> >> I was trying to see how to experiment with this using libbpf, but it
> >> looks like it's not yet supported?
> >>
> >> I see the following in xsk_socket__create():
> >>
> >> 475         if (umem->refcount) {
> >> 476                 pr_warning("Error: shared umems not supported by libbpf.\n");
> >> 477                 return -EBUSY;
> >> 478         }
> >>
> >
> > Using the XDP_SHARED_UMEM option is not supported in libbpf at this
> > point in time. In this mode you share a single umem, with a single
> > completion queue and a single fill queue, among many xsk sockets tied
> > to the same queue id. But note that you can register the same umem
> > area multiple times (creating multiple umem handles and multiple fqs
> > and cqs) to support xsk sockets that have different queue ids but the
> > same umem area. In both cases you need a mempool that can handle
> > multiple threads.
>
> Cool, this was not clear to me, and it would fit better than the shared
> fqs/cqs.
>
> William, this would be an interesting option for OVS to support zero
> memcpy on tx.

Great, much clearer to me now. I will take a look!

William
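[To make Magnus's suggestion concrete, here is a rough sketch against the libbpf xsk API of registering the same memory area twice so that each queue id gets its own umem handle, and therefore its own fq and cq. The interface name "eth0", the queue ids 0/1, and the frame count are purely illustrative, and all error handling is omitted; a thread-safe mempool such as the one sketched earlier would hand out frame addresses from the shared area.]

/* Two umem registrations over one shared buffer area, one xsk socket per
 * queue id. Illustrative only; not OVS code. */
#include <bpf/xsk.h>
#include <stdlib.h>
#include <unistd.h>

#define NUM_FRAMES 4096
#define FRAME_SIZE XSK_UMEM__DEFAULT_FRAME_SIZE

static struct xsk_ring_prod fq1, fq2, tx1, tx2;
static struct xsk_ring_cons cq1, cq2, rx1, rx2;
static struct xsk_umem *umem1, *umem2;
static struct xsk_socket *xsk1, *xsk2;

int main(void)
{
    size_t size = (size_t)NUM_FRAMES * FRAME_SIZE;
    void *area;

    /* One buffer area shared by both registrations. */
    if (posix_memalign(&area, getpagesize(), size))
        return 1;

    /* Register the same area twice: each handle gets its own fq/cq
     * (NULL configs select libbpf defaults). */
    xsk_umem__create(&umem1, area, size, &fq1, &cq1, NULL);
    xsk_umem__create(&umem2, area, size, &fq2, &cq2, NULL);

    /* One socket per queue id, each bound to its own umem handle but
     * backed by the same memory, so TX can reuse RX buffers without a copy. */
    xsk_socket__create(&xsk1, "eth0", 0, umem1, &rx1, &tx1, NULL);
    xsk_socket__create(&xsk2, "eth0", 1, umem2, &rx2, &tx2, NULL);

    return 0;
}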