On Mon, Jan 30, 2023 at 5:06 PM Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> wrote: > > On Mon, Jan 30, 2023 at 4:56 PM Joanne Koong <joannelkoong@xxxxxxxxx> wrote: > > > > On Mon, Jan 30, 2023 at 4:48 PM Andrii Nakryiko > > <andrii.nakryiko@xxxxxxxxx> wrote: > > > > > > On Fri, Jan 27, 2023 at 11:18 AM Joanne Koong <joannelkoong@xxxxxxxxx> wrote: > > > > > > > > Add skb dynptrs, which are dynptrs whose underlying pointer points > > > > to a skb. The dynptr acts on skb data. skb dynptrs have two main > > > > benefits. One is that they allow operations on sizes that are not > > > > statically known at compile-time (eg variable-sized accesses). > > > > Another is that parsing the packet data through dynptrs (instead of > > > > through direct access of skb->data and skb->data_end) can be more > > > > ergonomic and less brittle (eg does not need manual if checking for > > > > being within bounds of data_end). > > > > > > > > For bpf prog types that don't support writes on skb data, the dynptr is > > > > read-only (bpf_dynptr_write() will return an error and bpf_dynptr_data() > > > > will return a data slice that is read-only where any writes to it will > > > > be rejected by the verifier). > > > > > > > > For reads and writes through the bpf_dynptr_read() and bpf_dynptr_write() > > > > interfaces, reading and writing from/to data in the head as well as from/to > > > > non-linear paged buffers is supported. For data slices (through the > > > > bpf_dynptr_data() interface), if the data is in a paged buffer, the user > > > > must first call bpf_skb_pull_data() to pull the data into the linear > > > > portion. > > > > > > > > Any bpf_dynptr_write() automatically invalidates any prior data slices > > > > to the skb dynptr. This is because a bpf_dynptr_write() may be writing > > > > to data in a paged buffer, so it will need to pull the buffer first into > > > > the head. The reason it needs to be pulled instead of writing directly to > > > > the paged buffers is because they may be cloned (only the head of the skb > > > > is by default uncloned). As such, any bpf_dynptr_write() will > > > > automatically have its prior data slices invalidated, even if the write > > > > is to data in the skb head (the verifier has no way of differentiating > > > > whether the write is to the head or paged buffers during program load > > > > time). Please note as well that any other helper calls that change the > > > > underlying packet buffer (eg bpf_skb_pull_data()) invalidates any data > > > > slices of the skb dynptr as well. The stack trace for this is > > > > check_helper_call() -> clear_all_pkt_pointers() -> > > > > __clear_all_pkt_pointers() -> mark_reg_unknown(). > > > > > > > > For examples of how skb dynptrs can be used, please see the attached > > > > selftests. > > > > > > > > Signed-off-by: Joanne Koong <joannelkoong@xxxxxxxxx> > > > > --- > > > > include/linux/bpf.h | 82 +++++++++------ > > > > include/linux/filter.h | 18 ++++ > > > > include/uapi/linux/bpf.h | 37 +++++-- > > > > kernel/bpf/btf.c | 18 ++++ > > > > kernel/bpf/helpers.c | 95 ++++++++++++++--- > > > > kernel/bpf/verifier.c | 185 ++++++++++++++++++++++++++------- > > > > net/core/filter.c | 60 ++++++++++- > > > > tools/include/uapi/linux/bpf.h | 37 +++++-- > > > > 8 files changed, 432 insertions(+), 100 deletions(-) > > > > > > > > > > [...] > > > > > > > static const struct bpf_func_proto bpf_dynptr_write_proto = { > > > > @@ -1560,6 +1595,8 @@ static const struct bpf_func_proto bpf_dynptr_write_proto = { > > > > > > > > BPF_CALL_3(bpf_dynptr_data, const struct bpf_dynptr_kern *, ptr, u32, offset, u32, len) > > > > { > > > > + enum bpf_dynptr_type type; > > > > + void *data; > > > > int err; > > > > > > > > if (!ptr->data) > > > > @@ -1569,10 +1606,36 @@ BPF_CALL_3(bpf_dynptr_data, const struct bpf_dynptr_kern *, ptr, u32, offset, u3 > > > > if (err) > > > > return 0; > > > > > > > > - if (bpf_dynptr_is_rdonly(ptr)) > > > > - return 0; > > > > + type = bpf_dynptr_get_type(ptr); > > > > + > > > > + switch (type) { > > > > + case BPF_DYNPTR_TYPE_LOCAL: > > > > + case BPF_DYNPTR_TYPE_RINGBUF: > > > > + if (bpf_dynptr_is_rdonly(ptr)) > > > > + return 0; > > > > > > will something break if we return ptr->data for read-only LOCAL/RINGBUF dynptr? > > > > There will be nothing guarding against direct writes into read-only > > LOCAL/RINGBUF dynptrs if we return ptr->data. For skb type dynptrs, > > it's guarded by the ptr->data return pointer being marked as > > MEM_RDONLY in the verifier if the skb is non-writable. > > > > Ah, so we won't add MEM_RDONLY for bpf_dynptr_data()'s returned > PTR_TO_MEM if we know (statically) that dynptr is read-only? I think you meant will, not won't? If so, then yes we only add MEM_RDONLY for the returned data slice if we can pre-determine that the dynptr is read-only, else bpf_dynptr_data() will return null. > > Ok, not a big deal, just something that we might want to improve in the future. I'm curious to hear how you think this could be improved. If we're not able to know statically whether the dynptr is read-only or writable, then there's no way to enforce it in the verifier before the bpf program runs. Or is there some way to do this? > > > > > > > > + > > > > + data = ptr->data; > > > > + break; > > > > + case BPF_DYNPTR_TYPE_SKB: > > > > + { > > > > + struct sk_buff *skb = ptr->data; > > > > > > > > > > [...]