Re: XDP-hints: Howto support multiple BTF types per packet basis?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, 28 May 2021 07:35:34 -0700
John Fastabend <john.fastabend@xxxxxxxxx> wrote:

> Toke Høiland-Jørgensen wrote:
> > John Fastabend <john.fastabend@xxxxxxxxx> writes:
> >   
> > >> > > union and independent set of BTFs are two different things, I'll let
> > >> > > you guys figure out which one you need, but I replied how it could
> > >> > > look like in CO-RE world  
> > >> >
> > >> > I think a union is sufficient and more aligned with how the
> > >> > hardware would actually work.  
> > >> 
> > >> Sure. And I think those are two orthogonal concerns. You can start
> > >> with a single struct mynic_metadata with union inside it, and later
> > >> add the ability to swap mynic_metadata with another
> > >> mynic_metadata___v2 that will have a similar union but with a
> > >> different layout.  
> > >
> > > Right and then you just have normal upgrade/downgrade problems with
> > > any struct.
> > >
> > > Seems like a workable path to me. But, need to circle back to the
> > > what we want to do with it part that Jesper replied to.  
> > 
> > So while this seems to be a viable path for getting libbpf to do all the
> > relocations (and thanks for hashing that out, I did not have a good grip
> > of the details), doing it all in userspace means that there is no way
> > for the XDP program to react to changes once it has been loaded. So this
> > leaves us with a selection of non-very-attractive options, IMO. I.e.,
> > we would have to:  
> 
> I don't really understand what this means 'having XDP program to
> react to changes once it has been loaded.' What would a program look
> like thats dynamic? You can always version your metadata and
> write programs like this,
> 
>   if (meta->version == VERSION1) {do_foo}
>   else {do_bar}
> 
> And then have a headeer,
> 
>    struct meta {
>      int version;
>      union ...    // union of versions
>    }
> 
> I fail to see how a program could 'react' dynamically. An agent could
> load new programs dynamically into tail call maps of fentry with
> the need handlers, which would work as well and avoid unions.
> 
> > 
> > - have to block any modifications to the hardware config that would
> >   change the metadata format; this will probably result in irate users  
> 
> I'll need a concrete example if I swap out my parser block, I should
> also swap out my BPF for my shiny new protocol. I don't see how a
> user might write programs for things they've not configured hardware
> for yet. Leaving aside knobs like VLAN on/off, VXLAN on/off, and
> such which brings the next point.
> 
> > 
> > - require XDP programs to deal with all possible metadata permutations
> >   supported by that driver (by exporting them all via a BTF union or
> >   similar); this means a potential for combinatorial explosion of config
> >   options and as NICs become programmable themselves I'm not even sure
> >   if it's possible for the driver to know ahead of time  
> 
> I don't see the problem sorry. For current things that exist I can't
> think up too many fields vlan, timestamp, checksum(?), pkt_type,
> hash maybe.
> 
> For programmable pipelines (P4) then I don't see a problem with
> reloading your program or swapping out a program. I don't see the
> value of adding a new protocol for example dynamically. Surely
> the hardware is going to get hit with a big reset anyways.
> 
> > 
> > - throw up our hands and just let the user deal with it (i.e., to
> >   nothing and so require XDP programs to be reloaded if the NIC config
> >   changes); this is not very friendly and is likely to lead to subtle
> >   bugs if an XDP program parses the metadata assuming it is in a
> >   different format than it is  
> 
> I'm not opposed to user error causing logic bugs.  If I give
> users power to reprogram their NICs they should be capabable
> of managing a few BPF programs. And if not then its a space
> where a distro/vendor should help them with tooling.
> 
> > 
> > Given that hardware config changes are not just done by ethtool, but
> > also by things like running `tcpdump -j`, I really think we have to
> > assume that they can be quite dynamic; which IMO means we have to solve
> > this as part of the initial design. And I have a hard time seeing how
> > this is possible without involving the kernel somehow.  
> 
> I guess here your talking about building an skb? Wouldn't it
> use whatever logic it uses today to include the timestamp.
> This is a bit of an aside from metadata in the BPF program.
> 
> Building timestamps into
> skbs doesn't require BPF program to have the data. Or maybe
> the point is an XDP variant of tcpdump would like timestamps.
> But then it should be in the metadata IMO.

It sounds like we are all agreeing that the HW RX timestamp should be
stored in the XDP-metadata area right? 

As I understand, John don't like multiple structs, but want a single
struct, so lets create below simple struct that the driver code fills
out before calling our XDP-prog:

 struct meta {
	u32 timestamp_type;
	u64 rx_timestamp;
	u32 rxhash32;
	u32 version;
 };

This NIC is configured for PTP, but hardware can only do rx_timestamp
for PTP packets (like ixgbe).  (Assume both my XDP-prog and PTP
userspace prog want to see this HW TS).

What should I do as a driver developer to tell XDP-prog that the HW
rx_timestamp is not valid for this packet ?

 1. Always clear 'rx_timestamp' + 'timestamp_type' for non-PTP packets?
 2. or, set version to something else ?

I don't like option 1, because it will slowdown the normal none-PTP
packets, that doesn't need this timestamp.



Now I want to redirect this packet into a veth.  The veth device could
be running an XDP-prog, that also want to look at this timestamp info.
How can the veth XDP-prog know howto interpret metadata area. What if I
stored the bpf_id in the version fields in the struct?.

(Details: I also need this timestamp info transferred to xdp_frame,
because when redirecting into a veth (container) then I want this
timestamp set on the SKB to survive. I wonder how can I know what the
BTF-layout, guess it would be useful to have btf_id at this point)

> > 
> > Unless I'm missing something? WDYT?  
> 
> Distilling above down. I think we disagree on how useful
> dynamic programs are because of two reasons. First I don't
> see a large list of common attributes that would make the
> union approach as painful as you fear. And two, I believe
> users who are touching core hardware firmware need to also
> be smart enough (or have smart tools) to swap out their
> BPF programs in the correct order so as to not create
> subtle races. I didn't do it here but if we agree walking
> through that program swap flow with firmware update would
> be useful.

Hmm, I sense we are perhaps talking past each-other(?).  I am not
talking about firmware upgrades.  I'm arguing that enable/disable of HW
RX-timestamps will change the XDP-metadata usage dynamically runtime.
This is simply a normal userspace program that cause this changes e.g.
running 'tcpdump -j'.

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux