On Tue, Jan 31, 2023 at 4:40 PM Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> wrote: > > On Tue, Jan 31, 2023 at 04:11:47PM -0800, Andrii Nakryiko wrote: > > > > > > When prog is just parsing the packet it doesn't need to finalize with bpf_dynptr_write. > > > The prog can always write into the pointer followed by if (p == buf) bpf_dynptr_write. > > > No need for rdonly flag, but extra copy is there in case of cloned which > > > could have been avoided with extra rd_only flag. > > > > Yep, given we are designing bpf_dynptr_slice for performance, extra > > copy on reads is unfortunate. ro/rw flag or have separate > > bpf_dynptr_slice_rw vs bpf_dynptr_slice_ro? > > Either flag or two kfuncs sound good to me. Would it make sense to make bpf_dynptr_slice() as read-only variant, and bpf_dynptr_slice_rw() for read/write? I think the common case is read-only, right? And if users mistakenly use bpf_dynptr_slice() for r/w case, they will get a verifier error when trying to write into the returned pointer. While if we make bpf_dynptr_slice() as read-write, users won't realize they are paying a performance penalty for something that they don't actually need. > > > > Yes and No. bpf_skb_store_bytes is doing pull followed by memcpy, > > > while xdp_store_bytes does scatter gather copy into frags. > > > We should probably add similar copy to skb case to avoid allocations and pull. > > > Then in case of: > > > if (p == buf) { > > > bpf_dynptr_write(dp, buf, 16); > > > } > > > > > > the write will guarantee to succeed for both xdp and skb and the user > > > doesn't need to add error checking for alloc failures in case of skb. > > > > > > > That seems like a nice guarantee, agreed. > > Just grepped through few projects that use skb_store_bytes. > Everywhere it looks like: > if (bpf_skb_store_byte(...)) > return error; > > Not a pretty code to read. > I should prioritize bpf_assert() work, so we can assert from inside of > bpf_dynptr_write() eventually and remove all these IFs. > > > > > > > > > > > > But I wonder if for simple cases when users are mostly sure that they > > > > > > are going to access only header data directly we can have an option > > > > > > for bpf_dynptr_from_skb() to specify what should be the behavior for > > > > > > bpf_dynptr_slice(): > > > > > > > > > > > > - either return NULL for anything that crosses into frags (no > > > > > > surprising perf penalty, but surprising NULLs); > > > > > > - do bpf_skb_pull_data() if bpf_dynptr_data() needs to point to data > > > > > > beyond header (potential perf penalty, but on NULLs, if off+len is > > > > > > within packet). > > > > > > > > > > > > And then bpf_dynptr_from_skb() can accept a flag specifying this > > > > > > behavior and store it somewhere in struct bpf_dynptr. > > > > > > > > > > xdp does not have the bpf_skb_pull_data() equivalent, so xdp prog will still > > > > > need the write back handling. > > > > > > > > > > > > > Sure, unfortunately, can't have everything. I'm just thinking how to > > > > make bpf_dynptr_data() generically usable. Think about some common BPF > > > > routine that calculates hash for all bytes pointed to by dynptr, > > > > regardless of underlying dynptr type; it can iterate in small chunks, > > > > get memory slice, if possible, but fallback to generic > > > > bpf_dynptr_read() if doesn't. This will work for skb, xdp, LOCAL, > > > > RINGBUF, any other dynptr type. > > > > > > It looks to me that dynptr on top of skb, xdp, local can work as generic reader, > > > but dynptr as a generic writer doesn't look possible. > > > BPF_F_RECOMPUTE_CSUM and BPF_F_INVALIDATE_HASH are special to skb. > > > There is also bpf_skb_change_proto and crazy complex bpf_skb_adjust_room. > > > I don't think writing into skb vs xdp vs ringbuf are generalizable. > > > The prog needs to do a ton more work to write into skb correctly. > > > > If that's the case, then yeah, bpf_dynptr_write() can just return > > error for skb/xdp dynptrs? > > You mean to error when these skb only flags are present, but dynptr->type == xdp ? > Yep. I don't see another option. My point was that dynptr doesn't quite work as an > abstraction for writing into networking things. agreed > While libraries like: parse_http(&dynptr), compute_hash(&dynptr), find_string(&dynptr) > can indeed be generic and work with raw bytes, skb, xdp as an input, > which I think was on top of your wishlist for dynptr. yep, it would be a great property