RE: AF_XDP integration with FDio VPP? (Was: Questions about XDP)

Július Milan <Julius.Milan@xxxxxxxxxxxxx> · Wed, 25 Sep 2019 06:46:32 +0000

Hi Eelco

> Currently, OVS uses the mmaped memory directly; however, on egress, it copies the memory into the egress interface's mmaped memory.
Great, thanks for making this clear to me.

> Currently, OVS uses an AF_XDP memory pool per interface, so a further optimization could be to use a global memory pool so this extra copy is not needed.
Is this further optimization even possible? Every interface has its own non-shared umem, so from my point of view at least one
copy is necessary in the case you described above (when the RX interface is different from the TX interface). Or am I missing something?

Július

-----Original Message-----
From: Eelco Chaudron [mailto:echaudro@xxxxxxxxxx] 
Sent: Monday, September 23, 2019 3:02 PM
To: Július Milan <Julius.Milan@xxxxxxxxxxxxx>
Cc: Magnus Karlsson <magnus.karlsson@xxxxxxxxx>; William Tu <u9012063@xxxxxxxxx>; Björn Töpel <bjorn.topel@xxxxxxxxx>; Marek Závodský <marek.zavodsky@xxxxxxxxxxxxx>; Jesper Dangaard Brouer <brouer@xxxxxxxxxx>; xdp-newbies@xxxxxxxxxxxxxxx; Karlsson, Magnus <magnus.karlsson@xxxxxxxxx>; Thomas F Herbert <therbert@xxxxxxxxxx>; Kevin Laatz <kevin.laatz@xxxxxxxxx>
Subject: Re: AF_XDP integration with FDio VPP? (Was: Questions about XDP)

On 23 Sep 2019, at 11:00, Július Milan wrote:

> Many Thanks Magnus
>
>>> I have the following 2 questions:
>>>
>>> 1] When I use xsk_ring_prod__reserve followed by 
>>> xsk_ring_prod__submit, is it correct to submit fewer slots than I 
>>> reserved?
>>>     In some cases I can't determine exactly how much to reserve in 
>>> advance, since vpp buffers have a different size than xdp frames.
>>
>> Let me see if I understand this correctly. Say you reserve 10 
>> slots and later submit 4. This means you have reserved 6 more than 
>> you need.
>> Do you want to "unreserve" these and give them back to the ring? This 
>> is not supported by the interface today. Another way of solving this 
>> (if this is your problem and I am understanding it correctly, that
>> is) is that you in the next iteration only reserve 10 - 6 = 4 slots 
>> because you already have 6 slots available from the last iteration.
>> You could still submit 10 after this. But adding something like an 
>> unreserve option would be easy as long as we made sure it only 
>> affected local ring state. The global state seen in the shared 
>> variables between user space and kernel would not be touched, as this 
>> would affect performance negatively. Please let me know what you 
>> think.
>>
> Yes, you understand it correctly. I implemented it the way you 
> suggested, i.e. by keeping track of the index and count of reserved 
> slots (not committed yet, but it works well). Thanks again.
>
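The bookkeeping discussed above (reserve 10, submit 4, carry the remaining 6 into the next iteration) can be sketched roughly as follows. This is only an illustration, not the actual vpp driver code: `mock_ring`, `mock_reserve` and `mock_submit` are hypothetical stand-ins that imitate the all-or-nothing behaviour of xsk_ring_prod__reserve and the publish step of xsk_ring_prod__submit, and `tx_carry`, `reserve_with_carry` and `submit_used` are made-up names for the "index and count of reserved slots" state.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a producer ring. cached_prod is local state,
 * producer is what the other side of the ring would see. */
struct mock_ring {
    uint32_t cached_prod; /* local: next slot to hand out */
    uint32_t producer;    /* shared: slots published so far */
    uint32_t consumer;    /* shared: slots already consumed */
    uint32_t size;        /* number of slots in the ring */
};

/* All-or-nothing reserve: returns nb on success, 0 otherwise. */
static uint32_t mock_reserve(struct mock_ring *r, uint32_t nb, uint32_t *idx)
{
    uint32_t free_slots = r->size - (r->cached_prod - r->consumer);

    if (nb > free_slots)
        return 0;
    *idx = r->cached_prod;
    r->cached_prod += nb;
    return nb;
}

/* Publish nb previously reserved slots. */
static void mock_submit(struct mock_ring *r, uint32_t nb)
{
    r->producer += nb;
    /* Never publish more than was reserved. */
    assert((int32_t)(r->cached_prod - r->producer) >= 0);
}

/* Reserved-but-not-yet-submitted slots carried between iterations. */
struct tx_carry {
    uint32_t idx;     /* first reserved, unsubmitted slot */
    uint32_t pending; /* number of such slots */
};

/* Make `want` slots usable, reserving only what is not already carried
 * over. Returns want on success, 0 if the ring has no room. */
static uint32_t reserve_with_carry(struct mock_ring *r, struct tx_carry *c,
                                   uint32_t want, uint32_t *idx)
{
    if (want > c->pending) {
        uint32_t fresh_idx;
        uint32_t extra = want - c->pending;

        if (mock_reserve(r, extra, &fresh_idx) != extra)
            return 0;
        if (c->pending == 0)
            c->idx = fresh_idx; /* nothing carried: start at fresh slots */
        c->pending = want;      /* fresh slots follow the carried ones */
    }
    *idx = c->idx;
    return want;
}

/* Submit only the slots that were actually filled; keep the rest. */
static void submit_used(struct mock_ring *r, struct tx_carry *c, uint32_t used)
{
    assert(used <= c->pending);
    mock_submit(r, used);
    c->idx += used;
    c->pending -= used;
}
```

With this state, asking for 10 slots after submitting only 4 of the previous 10 reserves just 4 new ones, and only the local `cached_prod` moves ahead of the shared producer index in the meantime.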
>>> 2] Can I use hugepage-backed memory for umem? If not, is it planned 
>>> for the future?
>>>     For now it copies packets from the rx rings to vpp buffers, but 
>>> I am thinking about a true zero-copy path.
>>
>> Yes you can use huge pages today, but the internal AF_XDP code has 
>> not been optimized to use huge pages, so you will not get the full 
>> benefit from them today. Kevin Laatz, added to this mail, is working 
>> on optimizing the AF_XDP code for huge pages. If you want to know 
>> more or have some requirements, do not hesitate to contact him.
>>
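As a minimal sketch of what "hugepage-backed memory for umem" could look like on the allocation side, assuming a Linux host: the helper below tries an anonymous MAP_HUGETLB mapping and falls back to normal pages when no huge pages are available. The names `alloc_umem_area`, `FRAME_SIZE`, `NUM_FRAMES` and `UMEM_LEN` are illustrative only, and the sizes are arbitrary. Registering the area with AF_XDP (for example via xsk_umem__create) is deliberately left out, so this shows only the memory side.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <stddef.h>
#include <sys/mman.h>

#define FRAME_SIZE 2048u  /* illustrative frame size */
#define NUM_FRAMES 4096u  /* illustrative frame count */
#define UMEM_LEN ((size_t)FRAME_SIZE * NUM_FRAMES) /* 8 MiB */

/* Try to back the umem area with huge pages; fall back to normal pages.
 * Sets *used_huge to 1 or 0 accordingly. Returns NULL on failure. */
static void *alloc_umem_area(size_t len, int *used_huge)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);

    if (p != MAP_FAILED) {
        *used_huge = 1;
        return p;
    }

    /* No huge pages reserved or permitted: use regular pages instead. */
    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;

    *used_huge = 0;
    return p; /* this area would then be handed to the umem setup call */
}
```

Whether the huge-page mapping succeeds depends on the host configuration (for instance, how many huge pages are reserved), which is why the fallback path is kept.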
> Kevin, will the API for using huge pages change significantly during 
> the optimization work, or can I already start rewriting my vpp driver 
> to use hugepage-backed memory?
> Also, please let me know when you consider the AF_XDP code optimized 
> for huge pages.
>
> William, if I may ask one more question:
> Does the OVS implementation of the af_xdp driver copy packet data from 
> the af_xdp mmaped ring buffers into OVS "buffers" (some structure that 
> represents the packet in OVS), or is it zero-copy in this respect, i.e. 
> the OVS "buffers" mempool is directly mmaped as the ring so that no 
> copy on RX is needed? The second case would be very valuable to me as 
> inspiration.

Currently, OVS uses the mmaped memory directly; however, on egress, it copies the memory into the egress interface's mmaped memory.
Currently, OVS uses an AF_XDP memory pool per interface, so a further optimization could be to use a global memory pool so this extra copy is not needed.
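
To illustrate why the per-interface pools force a copy: a descriptor's address is an offset into one particular pool, so it means nothing in another interface's pool. The sketch below is a plain-C model, not OVS code. `struct pool`, `struct desc`, `forward_copy` and `forward_shared` are hypothetical names, and `struct desc` only mirrors the addr/len idea of an AF_XDP descriptor. The shared case simply assumes both interfaces could be bound to one pool, which this sketch does not verify for any particular kernel version.

```c
#include <stdint.h>
#include <string.h>

#define FRAME_SIZE 2048u /* illustrative frame size */
#define NUM_FRAMES 8u    /* illustrative frame count */

/* Model of one packet memory pool (one umem area). */
struct pool {
    uint8_t mem[NUM_FRAMES * FRAME_SIZE];
};

/* Model of a descriptor: offset into a pool plus packet length. */
struct desc {
    uint64_t addr;
    uint32_t len;
};

/* Per-interface pools: in.addr is only valid inside the RX pool, so the
 * payload has to be copied into a free frame of the TX pool. */
static struct desc forward_copy(const struct pool *rx, struct desc in,
                                struct pool *tx, uint64_t tx_frame_addr)
{
    struct desc out = { .addr = tx_frame_addr, .len = in.len };

    memcpy(&tx->mem[tx_frame_addr], &rx->mem[in.addr], in.len);
    return out;
}

/* One global pool (assumption): the same offset is valid on both sides,
 * so the descriptor is handed over as-is and no payload is copied. */
static struct desc forward_shared(struct desc in)
{
    return in;
}
```

In the copy variant the cost grows with the packet length, while in the shared variant only the small descriptor moves between the rings.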

>
>> /Magnus
>>
>
> Thanks a lot,
>
> Július