On Thu, 5 Jan 2023 11:57:32 -0500 Andy Gospodarek wrote:
> > So my main concern would be that if we "allow" this, the only way to
> > write an interoperable XDP program will be to use bpf_xdp_load_bytes()
> > for every packet access. Which will be slower than DPA, so we may end up
> > inadvertently slowing down all of the XDP ecosystem, because no one is
> > going to bother with writing two versions of their programs. Whereas if
> > you can rely on packet headers always being in the linear part, you can
> > write a lot of the "look at headers and make a decision" type programs
> > using just DPA, and they'll work for multibuf as well.
>
> The question I would have is what is really the 'slow down' for
> bpf_xdp_load_bytes() vs DPA? I know you and Jesper can tell me how many
> instructions each use. :)

Until we have an efficient and inlined DPA access to frags an
unconditional memcpy() of the first 2 cachelines-worth of headers
in the driver must be faster than a piece-by-piece bpf_xdp_load_bytes()
onto the stack, right?

> Taking a step back...years ago Dave mentioned wanting to make XDP
> programs easy to write and it feels like using these accessor APIs would
> help accomplish that. If the kernel examples use bpf_xdp_load_bytes()
> accessors everywhere then that would accomplish that.

I've been pushing for an skb_header_pointer()-like helper but
the semantics were not universally loved :)
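
For anyone following along who has not looked at skb_header_pointer(),
here is a rough userspace sketch of the two access styles discussed
above. It is only a model: struct pkt, struct frag, pkt_load_bytes() and
pkt_header_pointer() are hypothetical names made up for illustration,
not kernel or BPF helper APIs. The first function always copies, which
is roughly what a load_bytes-style accessor does; the second returns a
pointer straight into the linear part when the requested range fits
there and only falls back to a copy when it does not.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical userspace model of a multi-buffer packet: a linear part
 * followed by fragments. This is NOT the kernel's struct xdp_buff. */
struct frag {
	const uint8_t *data;
	size_t len;
};

struct pkt {
	struct frag linear;        /* region reachable by direct access */
	const struct frag *frags;  /* remaining, non-linear fragments */
	size_t nr_frags;
};

/* Always-copy accessor: walk the linear part, then each fragment, and
 * copy len bytes starting at offset into buf.
 * Returns 0 on success, -1 if the range runs past the end of the packet. */
int pkt_load_bytes(const struct pkt *p, size_t offset, void *buf, size_t len)
{
	uint8_t *dst = buf;
	size_t i;

	for (i = 0; i <= p->nr_frags && len > 0; i++) {
		const struct frag *f = (i == 0) ? &p->linear : &p->frags[i - 1];
		size_t chunk;

		if (offset >= f->len) {
			/* Requested range starts after this buffer. */
			offset -= f->len;
			continue;
		}
		chunk = f->len - offset;
		if (chunk > len)
			chunk = len;
		memcpy(dst, f->data + offset, chunk);
		dst += chunk;
		len -= chunk;
		offset = 0;
	}
	return len == 0 ? 0 : -1;
}

/* skb_header_pointer()-like accessor: if [offset, offset + len) lies
 * entirely in the linear part, return a pointer into the packet and
 * leave buf untouched. Otherwise copy into buf and return buf.
 * Returns NULL if the range is out of bounds. */
const void *pkt_header_pointer(const struct pkt *p, size_t offset,
			       size_t len, void *buf)
{
	if (offset <= p->linear.len && len <= p->linear.len - offset)
		return p->linear.data + offset;
	if (pkt_load_bytes(p, offset, buf, len) < 0)
		return NULL;
	return buf;
}
```

Note that with the second style the caller cannot assume which case it
got: the returned pointer may alias the packet or the caller's own
buffer, so it is only safe to treat it as read-only.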