On 10/06, Maryam Tahhan wrote:
On 05/10/2022 19:47, sdf@xxxxxxxxxx wrote:
> On 10/05, Toke Høiland-Jørgensen wrote:
> > Stanislav Fomichev <sdf@xxxxxxxxxx> writes:
>
> > > On Tue, Oct 4, 2022 at 5:59 PM Jakub Kicinski <kuba@xxxxxxxxxx> wrote:
> > >>
> > >> On Tue, 4 Oct 2022 17:25:51 -0700 Martin KaFai Lau wrote:
> > >> > An intentionally wild question: what does it take for the driver
> > >> > to return the hints? Is the rx_desc and rx_queue enough? When the
> > >> > xdp prog is calling a kfunc/bpf-helper, like 'hwtstamp =
> > >> > bpf_xdp_get_hwtstamp()', can the driver replace it with some
> > >> > inline bpf code (like how the inline code is generated for the
> > >> > map_lookup helper)? The xdp prog can then store the hwtstamp in
> > >> > the meta area in any layout it wants.
> > >>
> > >> Since you mentioned it... FWIW that was always my preference rather
> > >> than the BTF magic :) The jited image would have to be per-driver
> > >> like we do for BPF offload, but that's easy to do from the technical
> > >> perspective (I doubt many deployments bind the same prog to multiple
> > >> HW devices).
> > >
> > > +1, sounds like a good alternative (got your reply while typing).
> > > I'm not too versed in the rx_desc/rx_queue area, but it seems like,
> > > worst case, bpf_xdp_get_hwtstamp can probably receive an xdp_md ctx
> > > and parse it out from the pre-populated metadata?
> > >
> > > Btw, do we also need to think about the redirect case? What happens
> > > when I redirect a frame from device A with one metadata format to
> > > device B with another?
>
> > Yes, we absolutely do! In fact, to me this (redirects) is the main
> > reason why we need the ID in the packet in the first place: when
> > running on (say) a veth, an XDP program needs to be able to deal with
> > packets from multiple physical NICs.
>
> > As far as the API is concerned, my hope was that we could solve this
> > with a CO-RE-like approach where the program author just writes
> > something like:
>
> > hw_tstamp = bpf_get_xdp_hint("hw_tstamp", u64);
>
> > and bpf_get_xdp_hint() is really a macro (or a special kind of
> > relocation?) and libbpf would do the following on load:
> >
> > - query the kernel BTF for all possible xdp_hint structs
> > - figure out which of them have a 'u64 hw_tstamp' member
> > - generate the necessary conditionals / jump table to disambiguate on
> >   the BTF_ID in the packet
>
>
> > Now, if this is better done by a kfunc I'm not terribly opposed to
> > that either, but I'm not sure it's actually better/easier to do in the
> > kernel than in libbpf at load time?
>
> Replied in the other thread, but to reiterate here: the btf_id in the
> metadata has to stay, and we either pre-generate those
> bpf_get_xdp_hint() calls at the libbpf level or at kfunc load time, as
> you mention.
>
> But the program essentially has to handle all possible hints' btf_ids
> thrown at it by the system, and I'm not sure about the performance in
> that case :-/ Maybe that's something that can be hidden behind an "I
> might receive forwarded packets and I know how to handle all metadata
> formats" flag? By default, we'd pre-generate parsing only for that
> specific device.
I did a simple POC of Jesper's xdp-hints with AF_XDP and CNDP (Cloud
Native Data Plane). In the cases where my app had access to the HW, I
didn't need to handle all possible hints: I knew which drivers were on
the system, and those were the only hints I needed to deal with.

So at program init time I registered the relevant BTF_IDs (and some
callback functions to handle them) from the NICs that were available to
me in a simple tailq (tbh there were so few I could've probably used a
static array). When processing the hints, I then only needed to invoke
the appropriate callback function based on the received BTF_ID; I
didn't have a massive chain of if...else if...else statements.
In the case where we have redirection to a virtual NIC and we don't
necessarily know the underlying hints that are exposed to the app,
could we not still use the xdp_hints (as proposed by Jesper) themselves
to indicate the relevant drivers to the application? Or even indicate
them via a map or something?
Ideally, this should all be handled by the common infra
(libbpf/libxdp?). We probably don't want every xdp/af_xdp user to
custom-implement all this btf_id->layout parsing; that's why the
request for a selftest that shows how metadata can be accessed from
bpf/af_xdp.