Tariq Toukan <ttoukan.linux@xxxxxxxxx> writes:

> On 08/01/2023 14:33, Tariq Toukan wrote:
>>
>> On 05/01/2023 20:16, Jakub Kicinski wrote:
>>> On Thu, 5 Jan 2023 11:57:32 -0500 Andy Gospodarek wrote:
>>>>> So my main concern would be that if we "allow" this, the only way to
>>>>> write an interoperable XDP program will be to use bpf_xdp_load_bytes()
>>>>> for every packet access. Which will be slower than DPA, so we may end up
>>>>> inadvertently slowing down all of the XDP ecosystem, because no one is
>>>>> going to bother with writing two versions of their programs. Whereas if
>>>>> you can rely on packet headers always being in the linear part, you can
>>>>> write a lot of the "look at headers and make a decision" type programs
>>>>> using just DPA, and they'll work for multibuf as well.
>>>>
>>>> The question I would have is what is really the 'slow down' for
>>>> bpf_xdp_load_bytes() vs DPA? I know you and Jesper can tell me how many
>>>> instructions each use. :)
>>>
>>> Until we have an efficient and inlined DPA access to frags an
>>> unconditional memcpy() of the first 2 cachelines-worth of headers
>>> in the driver must be faster than a piece-by-piece bpf_xdp_load_bytes()
>>> onto the stack, right?
>>>
>>>> Taking a step back...years ago Dave mentioned wanting to make XDP
>>>> programs easy to write and it feels like using these accessor APIs would
>>>> help accomplish that. If the kernel examples use bpf_xdp_load_bytes()
>>>> accessors everywhere then that would accomplish that.
>>>
>>> I've been pushing for an skb_header_pointer()-like helper but
>>> the semantics were not universally loved :)
>>
>> Maybe it's time to re-consider.
>>
>> Is it something like an API that given an offset returns a pointer +
>> allowed length to be accessed?
>>
>> This sounds like a good direction to me, that avoids having any
>> linear-part-length assumptions, while preserving good performance.
>>
>> Maybe we can still require/guarantee that each single header (eth, ip,
>> tcp, ...) does not cross a frag/page boundary. For otherwise, a prog
>> needs to handle cases where headers span several fragments, so it has to
>> reconstruct the header by copying the different parts into some local
>> buffer.
>>
>> This can be achieved by having another assumption that AFAIK already
>> holds today: all fragments are of size PAGE_SIZE.
>>
>> Regards,
>> Tariq
>
> This can be a good starting point:
> static void *bpf_xdp_pointer(struct xdp_buff *xdp, u32 offset, u32 len)
>
> It's currently not exposed as a bpf-helper, and it works a bit
> differently to what I mentioned earlier: It gets the desired length, and
> fails in case it's not continuously accessible (i.e. this piece of data
> spans multiple frags).

Did a bit of digging through the mail archives. Exposing
bpf_xdp_pointer() as a helper was proposed back in March last year:

https://lore.kernel.org/r/20220306234311.452206-1-memxor@xxxxxxxxx

The discussion of this seems to have ended on "let's use dynptrs
instead". There was a patch series posted for this as well, which seems
to have stalled out with this comment from Alexei in October:

https://lore.kernel.org/r/CAADnVQKhv2YBrUAQJq6UyqoZJ-FGNQbKenGoPySPNK+GaOjBOg@xxxxxxxxxxxxxx

-Toke
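
For reference, the two access styles discussed above can be modelled in
plain userspace C. This is only a rough sketch, not kernel code: struct
model_pkt, model_pointer() and model_load_bytes() are hypothetical names
made up for illustration, and the real struct xdp_buff and helpers differ
in layout and error handling. The pointer-style lookup hands back a direct
pointer only when the requested range sits inside a single segment, while
the copy-style lookup gathers the bytes into a caller buffer even when
they span segments.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a multi-buffer packet. Segment 0 plays the
 * role of the linear part, the rest play the role of frags. */
#define MODEL_MAX_SEGS 4

struct model_seg {
	const uint8_t *data;
	uint32_t len;
};

struct model_pkt {
	struct model_seg seg[MODEL_MAX_SEGS];
	uint32_t nr_segs;
};

/* Pointer-style access: succeed only if [offset, offset + len) lies
 * entirely within one segment, otherwise return NULL. */
static const void *model_pointer(const struct model_pkt *pkt,
				 uint32_t offset, uint32_t len)
{
	uint32_t i;

	for (i = 0; i < pkt->nr_segs; i++) {
		const struct model_seg *s = &pkt->seg[i];

		if (offset < s->len) {
			if (len > s->len - offset)
				return NULL; /* crosses a segment boundary */
			return s->data + offset;
		}
		offset -= s->len;
	}
	return NULL; /* offset is past the end of the packet */
}

/* Copy-style access: gather len bytes starting at offset into buf,
 * walking across segments as needed. Returns 0 on success, -1 if the
 * range does not fit inside the packet. */
static int model_load_bytes(const struct model_pkt *pkt, uint32_t offset,
			    void *buf, uint32_t len)
{
	uint8_t *dst = buf;
	uint32_t total = 0, i;

	for (i = 0; i < pkt->nr_segs; i++)
		total += pkt->seg[i].len;
	if (offset > total || len > total - offset)
		return -1;

	for (i = 0; i < pkt->nr_segs && len > 0; i++) {
		const struct model_seg *s = &pkt->seg[i];
		uint32_t n;

		if (offset >= s->len) {
			offset -= s->len;
			continue;
		}
		n = s->len - offset;
		if (n > len)
			n = len;
		memcpy(dst, s->data + offset, n);
		dst += n;
		len -= n;
		offset = 0;
	}
	return 0;
}

/* Small demo: a 14-byte "linear part" followed by a 20-byte "frag". */
static int model_demo(void)
{
	uint8_t lin[14], frag[20], buf[4];
	struct model_pkt pkt = { .nr_segs = 2 };
	uint32_t i;

	for (i = 0; i < sizeof(lin); i++)
		lin[i] = (uint8_t)i;
	for (i = 0; i < sizeof(frag); i++)
		frag[i] = (uint8_t)(sizeof(lin) + i);
	pkt.seg[0] = (struct model_seg){ lin, sizeof(lin) };
	pkt.seg[1] = (struct model_seg){ frag, sizeof(frag) };

	/* A range inside one segment gets a direct pointer. */
	assert(model_pointer(&pkt, 0, 14) == lin);
	/* A range straddling the boundary is refused... */
	assert(model_pointer(&pkt, 12, 4) == NULL);
	/* ...but the copying variant can still assemble it. */
	assert(model_load_bytes(&pkt, 12, buf, 4) == 0);
	assert(buf[0] == 12 && buf[1] == 13 && buf[2] == 14 && buf[3] == 15);
	return 0;
}
```

In this model, a "look at headers and make a decision" program would try
model_pointer() first and fall back to model_load_bytes() only when it
returns NULL, so the copy is paid only for ranges that straddle a boundary.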