On Thu, Jan 05, 2023 at 10:16:42AM -0800, Jakub Kicinski wrote:
> On Thu, 5 Jan 2023 11:57:32 -0500 Andy Gospodarek wrote:
> > > So my main concern would be that if we "allow" this, the only way to
> > > write an interoperable XDP program will be to use bpf_xdp_load_bytes()
> > > for every packet access. Which will be slower than DPA, so we may end up
> > > inadvertently slowing down all of the XDP ecosystem, because no one is
> > > going to bother with writing two versions of their programs. Whereas if
> > > you can rely on packet headers always being in the linear part, you can
> > > write a lot of the "look at headers and make a decision" type programs
> > > using just DPA, and they'll work for multibuf as well.
> >
> > The question I would have is what is really the 'slow down' for
> > bpf_xdp_load_bytes() vs DPA? I know you and Jesper can tell me how many
> > instructions each use. :)
>
> Until we have an efficient and inlined DPA access to frags an
> unconditional memcpy() of the first 2 cachelines-worth of headers
> in the driver must be faster than a piece-by-piece bpf_xdp_load_bytes()
> onto the stack, right?

100%

Seems like we are back to speed vs ease of use, then?

> > Taking a step back...years ago Dave mentioned wanting to make XDP
> > programs easy to write and it feels like using these accessor APIs would
> > help accomplish that. If the kernel examples use bpf_xdp_load_bytes()
> > accessors everywhere then that would accomplish that.
>
> I've been pushing for an skb_header_pointer()-like helper but
> the semantics were not universally loved :)

I didn't recall that -- maybe I'll check the archives and see what I can
find.