On Fri, Oct 28, 2022 at 11:05 AM Jakub Kicinski <kuba@xxxxxxxxxx> wrote: > > On Fri, 28 Oct 2022 08:58:18 -0700 John Fastabend wrote: > > A bit of extra commentary. By exposing the raw kptr to the rx > > descriptor we don't need driver writers to do anything. > > And can easily support all the drivers out the gate with simple > > one or two line changes. This pushes the interesting parts > > into userspace and then BPF writers get to do the work without > > bother driver folks and also if its not done today it doesn't > > matter because user space can come along and make it work > > later. So no scattered kernel dependencies which I really > > would like to avoid here. Its actually very painful to have > > to support clusters with N kernels and M devices if they > > have different features. Doable but annoying and much nicer > > if we just say 6.2 has support for kptr rx descriptor reading > > and all XDP drivers support it. So timestamp, rxhash work > > across the board. > > IMHO that's a bit of wishful thinking. Driver support is just a small > piece, you'll have different HW and FW versions, feature conflicts etc. > In the end kernel version is just one variable and there are many others > you'll already have to track. > > And it's actually harder to abstract away inter HW generation > differences if the user space code has to handle all of it. I've had the same concern: Until we have some userspace library that abstracts all these details, it's not really convenient to use. IIUC, with a kptr, I'd get a blob of data and I need to go through the code and see what particular type it represents for my particular device and how the data I need is represented there. There are also these "if this is device v1 -> use v1 descriptor format; if it's a v2->use this another struct; etc" complexities that we'll be pushing onto the users. With kfuncs, we put this burden on the driver developers, but I agree that the drawback here is that we actually have to wait for the implementations to catch up. Jakub mentions FW and I haven't even thought about that; so yeah, bpf programs might have to take a lot of other state into consideration when parsing the descriptors; all those details do seem like they belong to the driver code. Feel free to send it early with just a handful of drivers implemented; I'm more interested about bpf/af_xdp/user api story; if we have some nice sample/test case that shows how the metadata can be used, that might push us closer to the agreement on the best way to proceed. > > To find the offset of fields (rxhash, timestamp) you can use > > standard BTF relocations we have all this machinery built up > > already for all the other structs we read, net_devices, task > > structs, inodes, ... so its not a big hurdle at all IMO. We > > can add userspace libs if folks really care, but its just a read so > > I'm not even sure that is helpful. > > > > I think its nicer than having kfuncs that need to be written > > everywhere. My $.02 although I'll poke around with below > > some as well. Feel free to just hang tight until I have some > > code at the moment I have intel, mellanox drivers that I > > would want to support. > > I'd prefer if we left the door open for new vendors. Punting descriptor > parsing to user space will indeed result in what you just said - major > vendors are supported and that's it.