On Fri, 5 Oct 2018 11:47:25 -0700
Zvi Effron <zeffron@xxxxxxxxxxxxx> wrote:

> If the requirement is just for contiguous memory, could this be
> resolved with allowing the driver to request multiple contiguous 4KB
> pages instead of one higher order page? Does that reduce the cost? (Or
> does it actually increase it?)

Sorry, but your question does not make sense. A higher order page _is_
multiple contiguous 4KB pages. Thus, the answer is that it is the same.

--Jesper

> On Thu, Oct 4, 2018 at 12:45 PM Jesper Dangaard Brouer
> <brouer@xxxxxxxxxx> wrote:
> >
> > On Thu, 4 Oct 2018 08:47:45 -0700
> > Rob Sherwood <rob.sherwood@xxxxxxxxx> wrote:
> >
> > > [not speaking for my current employer, but just from past experience ]
> > >
> > > Certainly a lot of the 'hard' requirements (hard meaning - "without
> > > this it won't work") I've seen could be served with a ~3k non-full
> > > jumbo frame.
> >
> > Glad to hear that _most_ use-cases can be solved with a ~3k non-full
> > jumbo-frame.
> >
> > > But at least what I've seen in the past was that because
> > > many of the host-side operations are per-packet limited (e.g., because
> > > of CPU or RAM, but ultimately turns into a max pps per host), a
> > > trivial way to increase application performance/reduce CPU for
> > > networking was to run at as large a frame size as possible. For
> > > example, if your application/host is really pps limited, then getting
> > > the frame size to increase from 3k to 9k means either 3x more
> > > bandwidth for the same cpu usage (assuming the application is
> > > bandwidth limited) or 1/3x the CPU usage for the same bandwidth (if
> > > the application is not bandwidth limited). Either way, IMHO it's a
> > > pretty big win.
> >
> > With XDP we have basically solved the issue of being PPS (packets per
> > sec) limited. And we can avoid these workarounds of using jumbo frames.
> > That is why it is a bit provoking to ask for jumbo-frames ;-)
> >
> >
> > People on this list might not realize that there is a significant
> > overhead in supporting larger than 4K frames for XDP, that is larger
> > than one memory-page. So let me explain...
> >
> > It is actually trivially easy for XDP to support jumbo frames, if the
> > NIC hardware supports storing RX frames into higher order pages (aka
> > compound pages, multiple 4K pages physically after each other) which
> > most HW does. (Page order0 = 4KB, order1=8KB, order2=16KB, order3=32KB).
> > Then XDP will work out-of-the-box, as the requirement is really that
> > the packet-payload is laid out as physically contiguous memory.
> >
> > The kernel page allocator can give us high-order pages, sure, but it
> > costs more, see slide 12 of [1]. The large jump to order-1 is because
> > order-0 has a Per-Cpu-Pages (PCP) cache. From order-1 and above, the
> > page allocator goes through a central (per NUMA) lock, which makes
> > things even worse, as this does not scale to multiple CPUs. And there
> > is also the point of wasting memory when processing 64Byte packets.
> > So, it is not 100% out of the picture that we could support
> > jumbo-frames for XDP, mostly because we can work around this
> > cost/issue by having recycle caches for these pages, which we even do
> > for order-0 pages.
> > Hint, I actually left this door open, as you can specify page-order
> > when setting up the page_pool API in the driver...
> >
> > [1] http://people.netfilter.org/hawk/presentations/MM-summit2017/MM-summit2017-JesperBrouer.pdf
> >
> > --
> > Best regards,
> >   Jesper Dangaard Brouer
> >   MSc.CS, Principal Kernel Engineer at Red Hat
> >   LinkedIn: http://www.linkedin.com/in/brouer
> >
> > >
> > >
> > > On Thu, Oct 4, 2018 at 12:52 AM Jesper Dangaard Brouer
> > > <brouer@xxxxxxxxxx> wrote:
> > > >
> > > > On Thu, 4 Oct 2018 08:44:27 +0200
> > > > Björn Töpel <bjorn.topel@xxxxxxxxx> wrote:
> > > >
> > > > > On Thu, 27 Sep 2018 at 02:56, Rob Sherwood <rob.sherwood@xxxxxxxxx> wrote:
> > > > > >
> > > > > > Thanks for the reference and the page-per-packet point makes sense.
> > > > > > At the same time, not supporting jumbo frames seems like a non-trivial
> > > > > > limitation. Is there a subset of drivers that do support jumbo
> > > > > > frames (or LRO or the other features that require multiple pages per
> > > > > > packet)?
> > > > > >
> > > > >
> > > > > No, not at the moment. XDP has a strict "one frame cannot exceed a
> > > > > page" constraint. Everything that applies to XDP in terms of
> > > > > constraints, applies to AF_XDP as well.
> > > > >
> > > > > Just to clarify, XDP supports jumbo frames -- i.e. larger than 1500B
> > > > > payload, just not the maximum 9000B size. My personal observation is
> > > > > that many deployments that "require jumbo frames", are usually OK with
> > > > > an MTU of ~3000B. Jumbo frames, yes. Full jumbo frames, no. :-)
> > > >
> > > > Thank you for clarifying that Bjørn.
> > > >
> > > > Can Alex or Rob explain:
> > > >
> > > >  (1) What is your use-case for wanting jumbo-frames?
> > > >
> > > > And (2) will an MTU of ~3000Bytes be sufficient? (which XDP does support)
> > > >
> > > >
> > > > > > On Tue, Sep 25, 2018 at 9:44 AM Alex Forster <aforster@xxxxxxxxxxxxxx> wrote:
> > > > > > >
> > > > > > > > On my test box running 4.18 if XDP is in use the MTU can not be
> > > > > > > > set higher than 3050.
> > > > > > >
> > > > > > > Ah, that answers a few questions for me. Thanks!
> > > > > > >
> > > > > > > Alex Forster
> > > >
> > > > --
> > > > Best regards,
> > > >   Jesper Dangaard Brouer
> > > >   MSc.CS, Principal Kernel Engineer at Red Hat
> > > >   LinkedIn: http://www.linkedin.com/in/brouer
> > > >

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer
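
For readers following the page-order arithmetic quoted in this thread (order0 = 4KB, order1 = 8KB, order2 = 16KB, order3 = 32KB), here is a rough sketch in Python of how a frame size maps to the smallest page order that can hold it in physically contiguous memory. It is only an illustration, not code from any driver: the names `PAGE_SIZE`, `min_page_order` and `buffer_bytes` are made up for this example, a 4KB base page is assumed, and it ignores any headroom or tailroom a driver reserves inside the buffer, so real MTU limits (such as the 3050 reported above) can be lower than the raw buffer size.

```python
PAGE_SIZE = 4096  # assumption: 4 KB base pages, as in the thread


def min_page_order(frame_bytes, page_size=PAGE_SIZE):
    """Return the smallest order with page_size * 2**order >= frame_bytes.

    Hypothetical helper for illustration; it does not account for
    headroom or tailroom that a real driver keeps in the buffer.
    """
    if frame_bytes <= 0:
        raise ValueError("frame_bytes must be positive")
    order = 0
    while (page_size << order) < frame_bytes:
        order += 1
    return order


def buffer_bytes(order, page_size=PAGE_SIZE):
    """Size in bytes of one physically contiguous order-N allocation."""
    return page_size << order


if __name__ == "__main__":
    # Frame sizes mentioned in the thread: small packets, standard MTU,
    # the ~3k "non-full" jumbo frame, and the full 9000B jumbo frame.
    for frame in (64, 1500, 3000, 9000):
        order = min_page_order(frame)
        size = buffer_bytes(order)
        print(f"{frame:>5}B frame -> order-{order} "
              f"({size}B buffer, {size - frame}B unused)")
```

Under these assumptions a ~3000B frame still fits in an order-0 page, while a 9000B frame needs an order-2 (16KB) allocation. If RX buffers are sized for the largest frame, a 64-byte packet landing in such a buffer leaves almost all of the 16KB unused, which is the memory-waste point made above.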