Re: AF_XDP metadata/hints

Alexander Lobakin wrote:
> From: Toke Høiland-Jørgensen <toke@xxxxxxxxxx>
> Date: Sun, 23 May 2021 13:54:47 +0200
> 
> > Saeed Mahameed <saeed@xxxxxxxxxx> writes:
> > 
> > > On Fri, 2021-05-21 at 15:31 +0200, Jesper Dangaard Brouer wrote:
> > >> On Fri, 21 May 2021 10:53:40 +0000
> > >> "Lobakin, Alexandr" <alexandr.lobakin@xxxxxxxxx> wrote:
> > >>
> > >> > I've opened two discussions at https://github.com/alobakin/linux,
> > >> > feel free to join them and/or create new ones to share your thoughts
> > >> > and concerns.
> > >>
> > >> Thanks Alexandr for keeping the thread/subject alive.
> > >>
> > >> I guess this is a new GitHub feature, "Discussions".  I've never used
> > >> that in a project before, let's see how this goes.  The usual approach
> > >> is discussions over email on netdev (Cc. netdev@xxxxxxxxxxxxxxx).
> > >
> > > I agree we need full visibility and transparency, I actually recommend:
> > > bpf@xxxxxxxxxxxxxxx
> > 
> > +1, please keep this on the list :)
> 
> Sure, let's keep it the classic way.
> I removed the netdev ML from the CCs and added bpf there.
> 
> Regarding the comments from GitHub discussions:
> 
> alobakin:
> 
> > Since 5.11, it's now possible to obtain a BTF not only for vmlinux,
> > but also for modules.
> > This will eliminate a need for manually composing and registering a
> > BTF inside the driver code, which is 100+ locs for ice for example.
> > 
> > That's obviously not the most straightforward and trivial way, but
> > could help a lot.
> 
> saeedtx:
> 
> > the point of registering BTF directly from the driver is to allow

There is no particular reason the BTF has to come from the driver; it
could also be generated in userspace or elsewhere. The driver is
handy because the driver, at least, should always have correct BTF,
so you avoid versioning problems to some extent.

> > "Flex metadata" meaning that meta data format can be constructed on
> > the fly according to user demand.

How is flex metadata configured? I believe this is going to need
some user tooling and a hard reset (ucode load?) in the driver to
transition the hardware state.

My original vision was to use P4 (or whatever language) to build
your necessary microcode/firmware/blob, then compile that for your
specific NIC hardware backend. That process should give you
two objects: the BTF and the blob to throw at the hardware.
Letting the driver expose the BTF over /sys/fs/btf/driver.btf
makes a lot of sense as well, but is not strictly necessary
as long as you have some way to get the BTF.
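
To make "some way to get the BTF" concrete, here is a minimal
userspace sketch that decodes the fixed header at the start of a raw
BTF blob. The header layout (struct btf_header, magic 0xEB9F) is the
one the kernel documents; the function name parse_btf_header is made
up for illustration, and where the blob comes from is left open.
Current kernels already publish vmlinux and module BTF under
/sys/kernel/btf/, while a per-driver file such as the one proposed
above is still an assumption.

```python
import struct

BTF_MAGIC = 0xEB9F

# struct btf_header: u16 magic, u8 version, u8 flags, then five u32 fields.
_HDR_FIELDS = ("magic", "version", "flags", "hdr_len",
               "type_off", "type_len", "str_off", "str_len")


def parse_btf_header(blob: bytes) -> dict:
    """Decode the BTF header of a raw blob (hypothetical helper name).

    BTF is written in the producer's native byte order, so try
    little-endian first and fall back to big-endian based on the magic.
    """
    for endian in ("<", ">"):
        hdr = struct.Struct(endian + "HBBIIIII")
        if len(blob) < hdr.size:
            raise ValueError("blob too short for a BTF header")
        values = hdr.unpack_from(blob)
        if values[0] == BTF_MAGIC:
            return dict(zip(_HDR_FIELDS, values))
    raise ValueError("bad BTF magic")
```

Reading the first 24 bytes of /sys/kernel/btf/vmlinux and passing
them to parse_btf_header() returns the header fields as a dict; the
same call would work on a blob produced by an offline compiler.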

Anyway, from a design standpoint, IMO hardware configuration should
be done independently of any BPF/BTF operations.

> > BTF for modules is constructed only at compilation time and
> > registered only on module load. so there is no way to implement flex
> > metadata with vmlinux BTF. we still need a dynamic registration API
> > for current and future HW where the HW will provide the BTF
> > dynamically.

+1. Can we expose it in /sys/fs/btf/? That seems like the reasonable
thing to me.

> > 
> > I am sure we can find multiple ways to reduce the 100+ LOC, but the
> > goal is to have the dynamic btf_register/unregister API
> 
> We initially planned to register just one (or several) predefined
> BTF(s) per module/netdevice that would provide a full list of
> supported fields. The flexibility of metadata then is in that BPF
> core calls for netdevice's ndo_bpf() on BPF program setup and
> provides a metadata layout requested by that BPF prog to the driver,

I don't think this is the right direction. The driver should be
telling us what's supported, or we should "just" know because we
configured it. Overloading ndo_bpf() with the config step
seems unnecessarily complex. CO-RE is going to happen way before
we even get to ndo_bpf(), so trying to decide the layout this
late is likely not going to work. How would you even know what
to do with a load op?
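
A toy model of the ordering problem, assuming nothing about libbpf
internals: field offsets are resolved against a known layout once,
at load time, and the program then reads metadata using only those
resolved offsets. The layouts, field names and helper functions
below are all hypothetical and exist only for illustration.

```python
import struct

# Hypothetical metadata layouts: field name -> (byte offset, struct format).
# Neither corresponds to a real driver.
LAYOUT_A = {"rx_hash": (0, "<I"), "csum": (4, "<H")}
LAYOUT_B = {"vlan_tci": (0, "<H"), "csum": (2, "<H"), "rx_hash": (4, "<I")}


def relocate(wanted, layout):
    """Resolve field names to offsets once, before the program runs.

    This stands in for the CO-RE step: it needs the layout up front
    and fails if a requested field is not described.
    """
    missing = [name for name in wanted if name not in layout]
    if missing:
        raise LookupError("layout lacks fields: " + ", ".join(missing))
    return {name: layout[name] for name in wanted}


def read_fields(meta: bytes, relocs):
    """Hot path: read fields using only the offsets fixed at load time."""
    return {name: struct.unpack_from(fmt, meta, off)[0]
            for name, (off, fmt) in relocs.items()}
```

The same "program" (a list of wanted field names) works against
either layout, but only because the layout was available when
relocate() ran. If the layout were chosen afterwards, the resolved
offsets would already be baked in.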

> so it could configure its hotpath (current NICs) or a hardware
> (future NICs) to build metadata accordingly.
> Driver can declare several BTFs (e.g. a "generic" one with things
> like hashes and csums, and a custom one) and it would work either
> through dynamic registering or through /sys approach.

IMO the driver needs to expose one single BTF image describing what
CO-RE ops need to be done on an object.

Separate the config of hardware from the BPF infrastructure; these
are two separate things.

> 
> This is all discussable anyways, we're happy to hear different
> opinions and thoughts to collectively choose the optimal way.
> 
> > -Toke
> 
> Thanks,
> Al


