Re: AF_XDP integration with FDio VPP? (Was: Questions about XDP)


On 25 Sep 2019, at 8:46, Július Milan wrote:

Hi Eelco

Currently, OVS uses the mmaped memory directly; however, on egress it copies the data into the egress interface's mmaped memory.
Great, thanks for making this clear to me.

Currently, OVS uses an AF_XDP memory pool per interface, so a further optimization could be to use a global memory pool so this extra copy is not needed.
Is it even possible to make this further optimization? Every interface has its own non-shared umem, so from my point of view at least one copy is necessary in the case you described above (when the RX interface is different from the TX interface). Or am I missing something?

Someone at Intel told me it would be possible to have one huge mempool that can be shared between interfaces. However, I have not researched/tried it.
Maybe Magnus can confirm?
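
For reference, a minimal sketch of the kind of shared-umem setup discussed above, assuming a libbpf/libxdp that provides xsk_socket__create_shared() and a kernel that allows XDP_SHARED_UMEM across devices; the interface names and sizes are only examples, not a confirmed configuration:

/* Hedged sketch: one umem shared by two AF_XDP sockets on different
 * interfaces.  Assumes xsk_socket__create_shared() is available and the
 * kernel supports sharing the umem across devices. */
#include <stdlib.h>
#include <unistd.h>
#include <bpf/xsk.h>

#define NUM_FRAMES 4096
#define FRAME_SIZE XSK_UMEM__DEFAULT_FRAME_SIZE

static struct xsk_umem *umem;
static struct xsk_ring_prod fq0, fq1;   /* one fill ring per socket */
static struct xsk_ring_cons cq0, cq1;   /* one completion ring per socket */
static struct xsk_ring_cons rx0, rx1;
static struct xsk_ring_prod tx0, tx1;
static struct xsk_socket *xsk0, *xsk1;

int setup_shared_umem(void)
{
	void *bufs;

	if (posix_memalign(&bufs, getpagesize(), NUM_FRAMES * FRAME_SIZE))
		return -1;

	/* Register the memory once; fq0/cq0 become the rings used by the
	 * first socket. */
	if (xsk_umem__create(&umem, bufs, NUM_FRAMES * FRAME_SIZE,
			     &fq0, &cq0, NULL))
		return -1;

	/* First socket on eth0 queue 0 reuses the umem's fill/completion
	 * rings. */
	if (xsk_socket__create_shared(&xsk0, "eth0", 0, umem,
				      &rx0, &tx0, &fq0, &cq0, NULL))
		return -1;

	/* Second socket on eth1 queue 0 shares the same umem but gets its
	 * own fill/completion rings, so frames could move between ports
	 * without an extra copy. */
	if (xsk_socket__create_shared(&xsk1, "eth1", 0, umem,
				      &rx1, &tx1, &fq1, &cq1, NULL))
		return -1;

	return 0;
}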


Július

-----Original Message-----
From: Eelco Chaudron [mailto:echaudro@xxxxxxxxxx]
Sent: Monday, September 23, 2019 3:02 PM
To: Július Milan <Julius.Milan@xxxxxxxxxxxxx>
Cc: Magnus Karlsson <magnus.karlsson@xxxxxxxxx>; William Tu <u9012063@xxxxxxxxx>; Björn Töpel <bjorn.topel@xxxxxxxxx>; Marek Závodský <marek.zavodsky@xxxxxxxxxxxxx>; Jesper Dangaard Brouer <brouer@xxxxxxxxxx>; xdp-newbies@xxxxxxxxxxxxxxx; Karlsson, Magnus <magnus.karlsson@xxxxxxxxx>; Thomas F Herbert <therbert@xxxxxxxxxx>; Kevin Laatz <kevin.laatz@xxxxxxxxx>
Subject: Re: AF_XDP integration with FDio VPP? (Was: Questions about XDP)



On 23 Sep 2019, at 11:00, Július Milan wrote:

Many thanks, Magnus

I have two more questions:

1] When I use xsk_ring_prod__reserve and a subsequent
xsk_ring_prod__submit, is it correct to submit less than I reserved?
    In some cases I can't determine exactly how much to reserve in
advance, since VPP buffers have a different size than XDP frames.

Let me see if I understand this correctly. Say you reserve 10
slots and later submit 4. This means you have reserved 6 more than
you need. Do you want to "unreserve" these and give them back to the
ring? This is not supported by the interface today. Another way of
solving this (if this is your problem and I am understanding it
correctly, that is) is that in the next iteration you only reserve
10 - 6 = 4 slots, because you already have 6 slots available from the
last iteration. You could still submit 10 after this. But adding
something like an unreserve option would be easy, as long as we made
sure it only affected local ring state. The global state seen in the
shared variables between user space and kernel would not be touched,
as that would affect performance negatively. Please let me know what
you think.

Yes, you understand it correctly. I implemented it the way you
suggested, i.e. by tracking the index and count of reserved slots (not
committed yet, but it works well). Thanks again.
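
A minimal sketch of the bookkeeping described above, i.e. caching reserved-but-unsubmitted TX slots and reusing them on the next batch instead of "unreserving" them; the tx_cache struct and the fill_one_desc callback are illustrative names, and a single producer thread per ring is assumed:

#include <bpf/xsk.h>

struct tx_cache {
	__u32 idx;     /* ring index of the first still-unused reserved slot */
	__u32 avail;   /* number of reserved-but-unsubmitted slots           */
};

/* Try to send up to 'want' packets; returns how many descriptors were
 * actually filled and submitted.  fill_one_desc() stands in for whatever
 * copies/maps one VPP buffer into an xdp_desc. */
static unsigned int tx_batch(struct xsk_ring_prod *tx, struct tx_cache *c,
			     unsigned int want,
			     int (*fill_one_desc)(struct xdp_desc *d, void *ctx),
			     void *ctx)
{
	unsigned int filled = 0;

	if (c->avail < want) {
		unsigned int need = want - c->avail;
		__u32 idx;

		/* Reserve only the slots we do not already hold from the
		 * previous call.  With a single producer, a successful
		 * reserve is contiguous with the cached slots. */
		if (xsk_ring_prod__reserve(tx, need, &idx) == need) {
			if (c->avail == 0)
				c->idx = idx;   /* cache was empty */
			c->avail += need;
		}
	}

	while (filled < want && filled < c->avail) {
		struct xdp_desc *d = xsk_ring_prod__tx_desc(tx, c->idx + filled);

		if (fill_one_desc(d, ctx))
			break;          /* ran out of data mid-batch */
		filled++;
	}

	/* Submit only what we filled; the rest stays cached for next call. */
	if (filled) {
		xsk_ring_prod__submit(tx, filled);
		c->idx += filled;
		c->avail -= filled;
	}
	return filled;
}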

2] Can I use hugepage-backed memory for the umem? If not, is it
planned for the future?
    For now it copies packets from the RX rings to VPP buffers, but I
am speculating about a straight zero-copy approach.

Yes, you can use huge pages today, but the internal AF_XDP code has
not been optimized for huge pages, so you will not get the full
benefit from them today. Kevin Laatz, added to this mail, is working
on optimizing the AF_XDP code for huge pages. If you want to know
more or have some requirements, do not hesitate to contact him.

Kevin, will the API for using huge pages change significantly during
the optimization process, or can I already start rewriting my VPP
driver to use hugepage-backed memory?
Also, please let me know when you consider the AF_XDP code optimized
for huge pages.
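
For what it is worth, a minimal sketch of backing the umem area with huge pages via MAP_HUGETLB, assuming huge pages have already been reserved on the system (e.g. via vm.nr_hugepages); only the allocation changes, and xsk_umem__create() is called as usual:

/* Hedged sketch: hugepage-backed umem area.  Sizes are examples only. */
#define _GNU_SOURCE            /* for MAP_ANONYMOUS / MAP_HUGETLB */
#include <sys/mman.h>
#include <bpf/xsk.h>

#define NUM_FRAMES 4096
#define FRAME_SIZE XSK_UMEM__DEFAULT_FRAME_SIZE
#define UMEM_SIZE  ((__u64)NUM_FRAMES * FRAME_SIZE)

int create_hugepage_umem(struct xsk_umem **umem,
			 struct xsk_ring_prod *fq, struct xsk_ring_cons *cq)
{
	/* Ask the kernel for huge pages; this fails if none are reserved
	 * or MAP_HUGETLB is not supported. */
	void *area = mmap(NULL, UMEM_SIZE, PROT_READ | PROT_WRITE,
			  MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);

	if (area == MAP_FAILED)
		return -1;

	/* Register the hugepage-backed area as the umem, same call as with
	 * ordinary pages. */
	return xsk_umem__create(umem, area, UMEM_SIZE, fq, cq, NULL);
}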

William, if I may ask the next question:
Does the OVS implementation of the af_xdp driver copy packet data from
the af_xdp mmaped ring buffers into OVS "buffers" (some structure that
represents the packet in OVS), or is it zero-copy in this sense, i.e.
the OVS "buffers" mempool is directly mmaped as the ring so no copy on
RX is needed? The second case would be very valuable to me as
inspiration.

Currently, OVS uses the mmaped memory directly; however, on egress it copies the data into the egress interface's mmaped memory. Currently, OVS uses an AF_XDP memory pool per interface, so a further optimization could be to use a global memory pool so this extra copy is not needed.


/Magnus


Thanks a lot,

Július



