On Thu, Nov 23, 2023 at 10:53 AM Jakub Kicinski <kuba@xxxxxxxxxx> wrote:
>
> On Thu, 23 Nov 2023 17:53:42 +0000 Edward Cree wrote:
> > The kernel doesn't like to trust offload blobs from a userspace compiler,
> > because it has no way to be sure that what comes out of the compiler
> > matches the rules/tables/whatever it has in the SW datapath.
> > It's also a support nightmare because it's basically like each user
> > compiling their own device firmware.
>

Hi Jakub,

> Practically speaking every high speed NIC runs a huge binary blob of FW.
> First, let's acknowledge that as reality.
>

Yes. But we're also seeing a trend for programmable NICs. It's an
interesting question as to how the kernel can leverage that
programmability for the benefit of the user.

> Second, there is no equivalent for arbitrary packet parsing in the
> kernel proper. Offload means take something from the host and put it
> on the device. If there's nothing in the kernel, we can't consider
> the new functionality an offload.

That's completely true, however I believe that eBPF has expanded our
definition of "what's in the kernel". For instance, we can do arbitrary
parsing in an XDP/eBPF program (in fact, it's still on my list of
things to do to rip out the flow dissector C code and replace it with
eBPF).

(https://netdevconf.info/0x15/slides/16/Flow%20dissector_PANDA%20parser.pdf,
https://www.youtube.com/watch?v=zVnmVDSEoXc&list=PLrninrcyMo3L-hsJv23hFyDGRaeBY1EJO)

>
> I understand that "we offload SW functionality" is our general policy,
> but we should remember why this policy is in place, and not
> automatically jump to the conclusion.
>
> > At least normally with device firmware the driver side is talking to
> > something with narrow/fixed semantics and went through upstream
> > review, even if the firmware side is still a black box.
>
> We should be building things which are useful and open (as in
> extensible by people "from the street").
> With that in mind, to me, a more practical approach would be to try
> to figure out a common and rigid FW interface for expressing the
> parsing graph.

Parse graphs are best represented by a declarative representation, not
an imperative one. This is one of the main reasons I want to replace
the flow dissector: a parser written in imperative C code is difficult
to maintain, as evidenced by the myriad bugs in that code (particularly
when people added support for uncommon protocols). P4 got this part
right, however I don't believe we need to boil the ocean by programming
the kernel in a new language. A better alternative is to define an IR
for this purpose. We do that in Common Parser Language (CPL), which is
a JSON schema to describe parse graphs. With an IR we can compile into
arbitrary backends including P4, eBPF, C, and even custom assembly
instructions for parsing (arbitrary front-end languages are facilitated
as well).

(https://netdevconf.info/0x16/papers/11/High%20Performance%20Programmable%20Parsers.pdf)

>
> But that's an interface going from the binary blob to the kernel.
>
> > Just to prove I'm not playing favourites: this is *also* a problem with
> > eBPF offloads like Nanotubes, and I'm not convinced we have a viable
> > solution yet.
>
> BPF offloads are actual offloads. Config/state is in the kernel,
> you need to pop it out to user space, then prove that it's what
> user intended.

Seems like offloading eBPF byte code and running a VM in the offload
device is pretty much considered a non-starter. But what if we could
offload the _functionality_ of an eBPF program with confidence that the
functionality _exactly_ matches that of the eBPF program running in the
kernel? I believe that could be beneficial. For instance, we all know
that LRO never gained traction. The reason is that each vendor does it
however they want and no one can match the exact functionality that
SW GRO provides. It's not an offload of kernel SW, so it's not viable.
But, suppose we wrote GRO in some program that could be compiled into
eBPF and a device binary. Using something like that hash technique I
described, it seems like we could properly do a kernel offload of GRO
where the offload functionality matches the software in the kernel.

Tom
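
P.S. As a rough illustration of the declarative-versus-imperative point
about parse graphs above, here is a minimal sketch in Python. It is not
CPL and does not follow the CPL schema: the table layout, the field
names (min_len, hdr_len, next_proto, table) and the walk() helper are
all hypothetical, chosen only to show a parse graph held as data with
one small generic walker over it.

```python
# Hypothetical declarative parse graph (NOT the CPL schema).
# Each node states its minimum header length, optionally how to compute
# a variable header length, where the "next protocol" field lives, and
# which node each protocol value leads to.
PARSE_GRAPH = {
    "ether": {
        "min_len": 14,
        "next_proto": {"offset": 12, "size": 2},
        "table": {0x0800: "ipv4", 0x86DD: "ipv6"},
    },
    "ipv4": {
        "min_len": 20,
        # IHL is the low nibble of byte 0, counted in 32-bit words.
        "hdr_len": {"offset": 0, "mask": 0x0F, "multiplier": 4},
        "next_proto": {"offset": 9, "size": 1},
        "table": {6: "tcp", 17: "udp"},
    },
    "ipv6": {
        "min_len": 40,
        "next_proto": {"offset": 6, "size": 1},
        "table": {6: "tcp", 17: "udp"},
    },
    "tcp": {"min_len": 20},  # leaf nodes: no next_proto
    "udp": {"min_len": 8},
}


def walk(graph, packet, root="ether"):
    """Walk the graph over packet bytes; return [(node_name, offset), ...]."""
    path = []
    offset = 0
    name = root
    while name is not None:
        node = graph[name]
        if len(packet) - offset < node["min_len"]:
            break  # truncated header, stop parsing
        hdr_len = node["min_len"]
        if "hdr_len" in node:
            f = node["hdr_len"]
            hdr_len = (packet[offset + f["offset"]] & f["mask"]) * f["multiplier"]
            if hdr_len < node["min_len"]:
                break  # malformed length field, stop parsing
        path.append((name, offset))
        nxt = node.get("next_proto")
        if nxt is None:
            break  # leaf node
        start = offset + nxt["offset"]
        value = int.from_bytes(packet[start:start + nxt["size"]], "big")
        name = node["table"].get(value)  # None for unknown protocols
        offset += hdr_len
    return path


# Example: Ethernet + IPv4 (IHL=5, protocol 17) + UDP, all other bytes zero.
eth = bytes(12) + b"\x08\x00"
ipv4 = bytes([0x45]) + bytes(8) + bytes([17]) + bytes(10)
udp = bytes(8)
print(walk(PARSE_GRAPH, eth + ipv4 + udp))
```

In a sketch like this, supporting another protocol means adding a table
entry rather than editing control flow, and the same table could in
principle be serialized (for example as JSON) and handed to different
backends.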