On Fri, Sep 27, 2019 at 8:09 PM William Tu <u9012063@xxxxxxxxx> wrote:
>
> On Fri, Sep 27, 2019 at 12:02 AM Magnus Karlsson
> <magnus.karlsson@xxxxxxxxx> wrote:
> >
> > On Thu, Sep 26, 2019 at 1:34 AM William Tu <u9012063@xxxxxxxxx> wrote:
> > >
> > > On Wed, Sep 25, 2019 at 12:48 AM Eelco Chaudron <echaudro@xxxxxxxxxx> wrote:
> > > >
> > > >
> > > >
> > > > On 25 Sep 2019, at 8:46, Július Milan wrote:
> > > >
> > > > > Hi Eelco
> > > > >
> > > > >> Currently, OVS uses the mmaped memory directly; however, on egress it
> > > > >> copies the memory into the egress interface's mmaped memory.
> > > > > Great, thanks for making this clear to me.
> > > > >
> > > > >> Currently, OVS uses an AF_XDP memory pool per interface, so a further
> > > > >> optimization could be to use a global memory pool so this extra copy
> > > > >> is not needed.
> > > > > Is it even possible to make this further optimization? Since every
> > > > > interface has its own non-shared umem, from my point of view at least
> > > > > one copy is necessary for the case you described above (when the RX
> > > > > interface is different from the TX interface). Or am I missing something?
> > > >
> > > > Someone @Intel told me it would be possible to have one huge mempool
> > > > that can be shared between interfaces. However, I have not
> > > > researched/tried it.
> > >
> > > I thought about it before, but the problem is that the cq and fq are
> > > per-umem. So when there is only one umem shared by many queues or
> > > devices, each one has to acquire a lock before it can access the cq or
> > > fq. I think that might become much slower.
> >
> > You basically have to implement a mempool that can be used by multiple
> > processes. Unfortunately, there is no lean and mean standalone
> > implementation of a mempool. There is a good one in DPDK, but then you
> > get the whole DPDK package into your application, which is likely what
> > you wanted to avoid in the first place. Anyone for writing libmempool?
> >
> > /Magnus
> >
>
> That's interesting.
> Do you mean DPDK's rte_mempool, which supports multiple producers?

Yes.

> If I create a shared umem for queue1 and queue2, then each queue has its
> own tx/rx rings so they can process in parallel. But for handling the
> per-umem cq/fq, I can create a dedicated thread to process the cq/fq.
> So for example:
> Thread 1 for handling the cq/fq
> Thread 2 for processing queue1's tx/rx rings
> Thread 3 for processing queue2's tx/rx rings
> and the mempool should allow multiple producers and consumers.
>
> Does this sound correct?

You do not need a dedicated thread. Just something in the mempool code
that enforces mutual exclusion (a mutex or whatever) between threads 2
and 3 when they are performing operations on the mempool. Going with a
dedicated thread sounds complicated.

/Magnus

> Thanks
> William
>
> > > > Maybe Magnus can confirm?
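To make the mutual-exclusion suggestion above concrete, here is a minimal
sketch, not existing OVS, VPP, or libbpf code, of a umem frame pool shared by
the per-queue threads. The names xsk_frame_pool, fp_get and fp_put are made up
for illustration; a real pool would batch operations and also serialize access
to the shared fq/cq.

#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

struct xsk_frame_pool {
        pthread_mutex_t lock;     /* serializes the queue1/queue2 threads */
        uint64_t *free_addrs;     /* umem frame addresses currently free  */
        uint32_t free_cnt;
};

/* Pop one free umem frame address; returns false if the pool is empty. */
static bool fp_get(struct xsk_frame_pool *fp, uint64_t *addr)
{
        bool ok = false;

        pthread_mutex_lock(&fp->lock);
        if (fp->free_cnt > 0) {
                *addr = fp->free_addrs[--fp->free_cnt];
                ok = true;
        }
        pthread_mutex_unlock(&fp->lock);
        return ok;
}

/* Return a frame address, e.g. after reaping it from the completion ring. */
static void fp_put(struct xsk_frame_pool *fp, uint64_t addr)
{
        pthread_mutex_lock(&fp->lock);
        fp->free_addrs[fp->free_cnt++] = addr;
        pthread_mutex_unlock(&fp->lock);
}

Whether such a mutex-protected pool ends up cheaper than a lockless design like
DPDK's rte_mempool under load is exactly the open question in the thread.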
> > > >
> > > >
> > > > > Július
> > > > >
> > > > > -----Original Message-----
> > > > > From: Eelco Chaudron [mailto:echaudro@xxxxxxxxxx]
> > > > > Sent: Monday, September 23, 2019 3:02 PM
> > > > > To: Július Milan <Julius.Milan@xxxxxxxxxxxxx>
> > > > > Cc: Magnus Karlsson <magnus.karlsson@xxxxxxxxx>; William Tu
> > > > > <u9012063@xxxxxxxxx>; Björn Töpel <bjorn.topel@xxxxxxxxx>; Marek
> > > > > Závodský <marek.zavodsky@xxxxxxxxxxxxx>; Jesper Dangaard Brouer
> > > > > <brouer@xxxxxxxxxx>; xdp-newbies@xxxxxxxxxxxxxxx; Karlsson, Magnus
> > > > > <magnus.karlsson@xxxxxxxxx>; Thomas F Herbert <therbert@xxxxxxxxxx>;
> > > > > Kevin Laatz <kevin.laatz@xxxxxxxxx>
> > > > > Subject: Re: AF_XDP integration with FDio VPP? (Was: Questions about
> > > > > XDP)
> > > > >
> > > > >
> > > > >
> > > > > On 23 Sep 2019, at 11:00, Július Milan wrote:
> > > > >
> > > > >> Many Thanks Magnus
> > > > >>
> > > > >>>> I have 2 more questions:
> > > > >>>>
> > > > >>>> 1] When I use xsk_ring_prod__reserve and a successive
> > > > >>>> xsk_ring_prod__submit, is it correct to submit less than I
> > > > >>>> reserved?
> > > > >>>> In some cases I can't exactly determine how much to reserve in
> > > > >>>> advance, since vpp buffers have a different size than xdp frames.
> > > > >>>
> > > > >>> Let me see if I understand this correctly. Say you reserve 10
> > > > >>> slots and later submit 4. This means you have reserved 6 more than
> > > > >>> you need.
> > > > >>> Do you want to "unreserve" these and give them back to the ring? This
> > > > >>> is not supported by the interface today. Another way of solving this
> > > > >>> (if this is your problem and I am understanding it correctly, that
> > > > >>> is) is that in the next iteration you only reserve 10 - 6 = 4 slots,
> > > > >>> because you already have 6 slots available from the last iteration.
> > > > >>> You could still submit 10 after this. But adding something like an
> > > > >>> unreserve option would be easy as long as we made sure it only
> > > > >>> affected local ring state. The global state seen in the shared
> > > > >>> variables between user space and kernel would not be touched, as this
> > > > >>> would affect performance negatively. Please let me know what you
> > > > >>> think.
> > > > >>>
> > > > >> Yes, you understand it correctly. I implemented it the way you
> > > > >> suggested, i.e. by tracking the index and count of reserved slots (not
> > > > >> committed yet, but it works well), thanks again.
> > > > >>
> > > > >>>> 2] Can I use hugepage-backed memory for the umem? If not, is it
> > > > >>>> planned for the future?
> > > > >>>> For now it copies packets from the rx rings to vpp buffers, but I am
> > > > >>>> speculating about a straight zero-copy way.
> > > > >>>
> > > > >>> Yes, you can use huge pages today, but the internal AF_XDP code has
> > > > >>> not been optimized to use huge pages, so you will not get the full
> > > > >>> benefit from them today. Kevin Laatz, added to this mail, is working
> > > > >>> on optimizing the AF_XDP code for huge pages. If you want to know
> > > > >>> more or have some requirements, do not hesitate to contact him.
> > > > >>>
> > > > >> Kevin, will the API for using hugepages change significantly during the
> > > > >> optimization process, or can I already start to rewrite my vpp driver
> > > > >> to use hugepage-backed memory?
> > > > >> Also, please let me know when you consider the AF_XDP code optimized to
> > > > >> use huge pages.
> > > > >>
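As an aside, a rough sketch of the carry-over scheme described above (only
reserve the difference, keep unfilled slots for the next batch) could look as
follows, assuming libbpf's xsk.h. The tx_reservation struct and the two helper
functions are illustrative only and not part of any existing API.

#include <bpf/xsk.h>

struct tx_reservation {
        __u32 idx;      /* ring index of the first unfilled reserved slot */
        __u32 avail;    /* slots reserved earlier but not yet filled      */
};

/* Make sure at least `need` descriptors are reserved; returns 0 on success. */
static int ensure_reserved(struct xsk_ring_prod *tx,
                           struct tx_reservation *res, __u32 need)
{
        __u32 idx;

        if (res->avail >= need)
                return 0;

        /* Only ask the ring for what the previous batch did not leave over. */
        if (xsk_ring_prod__reserve(tx, need - res->avail, &idx) !=
            need - res->avail)
                return -1;              /* ring full; try again later */

        if (res->avail == 0)
                res->idx = idx;         /* fresh batch, remember where it starts */
        res->avail = need;
        return 0;
}

/* Descriptors were written with xsk_ring_prod__tx_desc(tx, res->idx + i);
 * publish only the ones actually filled and keep the rest reserved. */
static void submit_filled(struct xsk_ring_prod *tx,
                          struct tx_reservation *res, __u32 filled)
{
        xsk_ring_prod__submit(tx, filled);
        res->idx   += filled;
        res->avail -= filled;
}

The point, as noted above, is that only local ring state is tracked; the
producer pointer shared with the kernel is still advanced only by
xsk_ring_prod__submit() for descriptors that were really filled.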
> > > > >> William, if I may ask one more question.
> > > > >> Does the OVS implementation of the af_xdp driver copy packet data
> > > > >> from the af_xdp mmaped ring buffers into OVS "buffers" (some structure
> > > > >> representing the packet in OVS), or is it zero-copy in this manner,
> > > > >> i.e. the OVS "buffers" mempool is directly mmaped as the ring, so no
> > > > >> copy on RX is needed? In the second case it would be very valuable to
> > > > >> me as inspiration.
> > > > >
> > > > > Currently, OVS uses the mmaped memory directly; however, on egress it
> > > > > copies the memory into the egress interface's mmaped memory.
> > > > > Currently, OVS uses an AF_XDP memory pool per interface, so a further
> > > > > optimization could be to use a global memory pool so this extra copy
> > > > > is not needed.
> > > > >
> > > > >>
> > > > >>> /Magnus
> > > > >>>
> > > > >>
> > > > >> Thanks a lot,
> > > > >>
> > > > >> Július
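As a footnote on the egress copy Eelco describes above, a simplified sketch
(not OVS code) of what forwarding between two ports with separate umems ends
up doing is shown below. The helper and its parameters are hypothetical, and
fill/completion ring handling is omitted.

#include <string.h>
#include <bpf/xsk.h>

/* Forward one received descriptor from port A's RX ring to port B's TX ring,
 * where each port has its own umem area. */
static int forward_one(void *umem_a_area, struct xsk_ring_cons *rx_a,
                       void *umem_b_area, struct xsk_ring_prod *tx_b,
                       __u32 rx_idx, __u64 tx_frame_addr)
{
        const struct xdp_desc *rxd = xsk_ring_cons__rx_desc(rx_a, rx_idx);
        struct xdp_desc *txd;
        __u32 tx_idx;

        if (xsk_ring_prod__reserve(tx_b, 1, &tx_idx) != 1)
                return -1;              /* TX ring full */

        /* The extra copy: from port A's umem into port B's umem. */
        memcpy(xsk_umem__get_data(umem_b_area, tx_frame_addr),
               xsk_umem__get_data(umem_a_area, rxd->addr),
               rxd->len);

        txd = xsk_ring_prod__tx_desc(tx_b, tx_idx);
        txd->addr = tx_frame_addr;
        txd->len  = rxd->len;

        xsk_ring_prod__submit(tx_b, 1);
        return 0;
}

With a single shared umem, the memcpy() would disappear and only the
descriptor would need to move, which is the optimization discussed at the top
of the thread.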