Re: [PATCH bpf-next v1 0/5] Introduce bpf_packet_pointer helper

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Mar 10, 2022 at 3:30 PM Toke Høiland-Jørgensen <toke@xxxxxxxxxx> wrote:
>
> Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> writes:
>
> > On Tue, Mar 8, 2022 at 5:40 AM Toke Høiland-Jørgensen <toke@xxxxxxxxxx> wrote:
> >>
> >> Kumar Kartikeya Dwivedi <memxor@xxxxxxxxx> writes:
> >>
> >> > On Tue, Mar 08, 2022 at 11:18:52AM IST, Andrii Nakryiko wrote:
> >> >> On Sun, Mar 6, 2022 at 3:43 PM Kumar Kartikeya Dwivedi <memxor@xxxxxxxxx> wrote:
> >> >> >
> >> >> > Expose existing 'bpf_xdp_pointer' as a BPF helper named 'bpf_packet_pointer'
> >> >> > returning a packet pointer with a fixed immutable range. This can be useful to
> >> >> > enable DPA without having to use memcpy (currently the case in
> >> >> > bpf_xdp_load_bytes and bpf_xdp_store_bytes).
> >> >> >
> >> >> > The intended usage to read and write data for multi-buff XDP is:
> >> >> >
> >> >> >         int err = 0;
> >> >> >         char buf[N];
> >> >> >
> >> >> >         off &= 0xffff;
> >> >> >         ptr = bpf_packet_pointer(ctx, off, sizeof(buf), &err);
> >> >> >         if (unlikely(!ptr)) {
> >> >> >                 if (err < 0)
> >> >> >                         return XDP_ABORTED;
> >> >> >                 err = bpf_xdp_load_bytes(ctx, off, buf, sizeof(buf));
> >> >> >                 if (err < 0)
> >> >> >                         return XDP_ABORTED;
> >> >> >                 ptr = buf;
> >> >> >         }
> >> >> >         ...
> >> >> >         // Do some stores and loads in [ptr, ptr + N) region
> >> >> >         ...
> >> >> >         if (unlikely(ptr == buf)) {
> >> >> >                 err = bpf_xdp_store_bytes(ctx, off, buf, sizeof(buf));
> >> >> >                 if (err < 0)
> >> >> >                         return XDP_ABORTED;
> >> >> >         }
> >> >> >
> >> >> > Note that bpf_packet_pointer returns a PTR_TO_PACKET, not PTR_TO_MEM, because
> >> >> > these pointers need to be invalidated on clear_all_pkt_pointers invocation, and
> >> >> > it is also more meaningful to the user to see return value as R0=pkt.
> >> >> >
> >> >> > This series is meant to collect feedback on the approach, next version can
> >> >> > include a bpf_skb_pointer and exposing it as bpf_packet_pointer helper for TC
> >> >> > hooks, and explore not resetting range to zero on r0 += rX, instead check access
> >> >> > like check_mem_region_access (var_off + off < range), since there would be no
> >> >> > data_end to compare against and obtain a new range.
> >> >> >
> >> >> > The common name and func_id is supposed to allow writing generic code using
> >> >> > bpf_packet_pointer that works for both XDP and TC programs.
> >> >> >
> >> >> > Please see the individual patches for implementation details.
> >> >> >
> >> >>
> >> >> Joanne is working on a "bpf_dynptr" framework that will support
> >> >> exactly this feature, in addition to working with dynamically
> >> >> allocated memory, working with memory of statically unknown size (but
> >> >> safe and checked at runtime), etc. And all that within a generic
> >> >> common feature implemented uniformly within the verifier. E.g., it
> >> >> won't need any of the custom bits of logic added in patch #2 and #3.
> >> >> So I'm thinking that instead of custom-implementing a partial case of
> >> >> bpf_dynptr just for skb and xdp packets, let's maybe wait for dynptr
> >> >> and do it only once there?
> >> >>
> >> >
> >> > Interesting stuff, looking forward to it.
> >> >
> >> >> See also my ARG_CONSTANT comment. It seems like a pretty common thing
> >> >> where input constant is used to characterize some pointer returned
> >> >> from the helper (e.g., bpf_ringbuf_reserve() case), and we'll need
> >> >> that for bpf_dynptr for exactly this "give me direct access of N
> >> >> bytes, if possible" case. So improving/generalizing it now before
> >> >> dynptr lands makes a lot of sense, outside of bpf_packet_pointer()
> >> >> feature itself.
> >> >
> >> > No worries, we can continue the discussion in patch 1, I'll split out the arg
> >> > changes into a separate patch, and wait for dynptr to be posted before reworking
> >> > this.
> >>
> >> This does raise the question of what we do in the meantime, though? Your
> >> patch includes a change to bpf_xdp_{load,store}_bytes() which, if we're
> >> making it, really has to go in before those hit a release and become
> >> UAPI.
> >>
> >> One option would be to still make the change to those other helpers;
> >> they'd become a bit slower, but if we have a solution for that coming,
> >> that may be OK for a single release? WDYT?
> >
> > I must have missed important changes to bpf_xdp_{load,store}_bytes().
> > Does anything change about its behavior? If there are some fixes
> > specific to those helpers, we should fix them as well as a separate
> > patch. My main objection is adding a bpf_packet_pointer() special case
> > when we have a generic mechanism in the works that will come this use
> > case (among other use cases).
>
> Well it's not a functional change per se, but Kartikeya's patch is
> removing an optimisation from bpf_xdp_{load_store}_bytes() (i.e., the
> use of the bpf_xdp_pointer()) in favour of making it available directly
> to BPF. So if we don't do that change before those helpers are
> finalised, we will end up either introducing a performance regression
> for code using those helpers, or being stuck with the bpf_xdp_pointer()
> use inside them even though it makes more sense to move it out to BPF.
>
> So the "safe" thing to do would do the change to the store/load helpers
> now, and get rid of the bpf_xdp_pointer() entirely until it can be
> introduced as a BPF helper in a generic way. Of course this depends on
> whether you consider performance regressions to be something to avoid,
> but this being XDP IMO we should :)

I don't follow this logic.
Would you mean by "get rid of the bpf_xdp_pointer" ?
It's just an internal static function.

Also I don't believe that this patch set and exposing
bpf_xdp_pointer as a helper actually gives measurable performance wins.
It looks quirky to me and hard to use.




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux