Re: AF_XDP umem and jumbo frames?

On Fri, 5 Oct 2018 15:56:31 -0400
Justin Azoff <justin.azoff@xxxxxxxxx> wrote:

> > People on this list might not realize that there is a significant
> > overhead in supporting frames larger than 4K for XDP, that is, larger
> > than one memory page. So let me explain...
> >
> > It is actually trivially easy for XDP to support jumbo frames, if the
> > NIC hardware supports storing RX frames into higher-order pages (aka
> > compound pages, several 4K pages physically after each other), which most
> > HW does. (Page order0 = 4KB, order1 = 8KB, order2 = 16KB, order3 = 32KB.)
> > XDP will then work out of the box, as the real requirement is that the
> > packet payload is laid out as physically contiguous memory.
> >  
> 
> For the use cases of XDP_DROP or XDP_PASS, could XDP send as much of
> the packet as fits in a single page up to the eBPF program and allow
> decisions based on that?
> 
> For the flow bypass and DDoS drop stuff, you only need the L3 header to
> make the PASS/DROP decision, not the entire packet.
> 
> I suppose this would be a bit more complicated for modifying headers
> and using XDP_TX.

The key in your question is the phrase "a bit more complicated": it is
only a bit more complicated, so surely we can support feature "X".  But
XDP is designed for performance, where every nanosecond counts.  Feature
creep will slowly but surely kill this performance edge.

I'll try to explain the overhead of jumbo frames again, from another
angle.  XDP has gained performance up front by saying we don't support
jumbo frames.  Instead of allocating 3x 4KB pages per RX packet, we
only need to allocate a single 4KB page.  That in itself is a huge
performance win.  Are you saying that you want a feature that is used
in 1-5% of use cases, and that in general is going to slow down the
baseline performance of XDP?

One thing I realize is that people on this list are perhaps not
familiar with how NIC RX (via DMA) works.  On RX, we cannot know the RX
packet size up front.  Thus, when filling the NIC RX-ring memory slots,
we have to allocate room for the worst case.  E.g. 9000 bytes needs a
minimum of 3x4K=12K, and because the page allocator hands out pages in
power-of-two orders, the minimum becomes 4x4K=16K.  Thus, regardless of
packet length, the alloc size is the same.  (I will not go into detail on
how different drivers try to reduce this memory overhead, but only say
that those tricks cost CPU cycles.)

A last word on adding features to XDP: When adding features, I look
long and hard for ways that the feature checks can be pushed to setup
time, rather than runtime.

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer


