Since the inevitable LWN article has been written, let me put more
detail into what I already mentioned here:
https://lore.kernel.org/all/20240301090020.7c9ebc1d@xxxxxxxxxx/
for the benefit of non-networking people.

On Wed, 10 Apr 2024 10:01:26 -0400 Jamal Hadi Salim wrote:
> P4TC builds on top of many years of Linux TC experiences of a netlink
> control path interface coupled with a software datapath with an equivalent
> offloadable hardware datapath.

The point of having a SW datapath is to provide a blueprint for the
behavior. That is completely moot for P4, which comes as a standard.
Besides, we already have 5 (or more) flow offloads; we don't need a
6th, completely disconnected from the existing ones, leaving users
guessing which one to use and how they interact.

In my opinion, a reasonable way to implement a programmable parser for
Linux is:

1. User writes their parser in whatever DSL they want
2. User compiles the parser in user space
   2.1 Compiler embeds a representation of the parse graph in the blob
3. User puts the blob in /lib/firmware
4. devlink dev $dev reload action parser-fetch $filename
5. devlink loads the file, parses it to extract the representation
   from 2.1, and passes the blob to the driver
   5.1 driver/fw reinitializes the HW parser
   5.2 user can inspect the graph by dumping the common representation
       from 2.1 (via something like devlink dpipe, perhaps)
6. The parser tables are annotated with Linux offload targets (routes,
   classic ntuple, nftables, flower, etc.), with some tables being left
   as "raw"* (* a better name would be great)
7. ethtool ntuple is extended to support insertion of arbitrary rules
   into the "raw" tables
8. The other tables can only be inserted into using the subsystem they
   are annotated for

This builds on how some devices _already_ operate. It gives the
benefits of expressing parser information, and the ability to insert
rules for uncommon protocols, also to devices which are not
programmable.
And it uses ethtool ntuple, which SW people actually want to use.

Before the tin foil hats gather: we have no use for any of this at
Meta; I'm not trying to twist the design to fit the use cases of big
bad hyperscalers.
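For concreteness, the proposed flow could look something like the
sketch below. None of these interfaces exist today: the parser-fetch
reload action, the compiler name, and the "raw" table handling are all
hypothetical, and the commands merely follow the shape of the numbered
steps above.

```shell
# Hypothetical CLI sketch of the proposed workflow; nothing here
# exists today -- tool names and the parser-fetch action are
# illustrative only.

# Steps 1-2: compile the DSL parser description in user space
# (compiler name is made up); the compiler embeds the parse graph
# representation in the blob (step 2.1).
some-parser-compiler parser.p4 -o my_parser.bin

# Step 3: place the blob where firmware loading can find it.
cp my_parser.bin /lib/firmware/

# Step 4: ask devlink to fetch the blob; devlink extracts the graph
# and passes the blob to the driver, which reinitializes the HW
# parser (steps 5 and 5.1).
devlink dev pci/0000:01:00.0 reload action parser-fetch my_parser.bin

# Step 5.2: inspect the extracted parse graph, perhaps via dpipe.
devlink dpipe table show pci/0000:01:00.0

# Step 7: insert an arbitrary rule into a table annotated "raw",
# via an extended ethtool ntuple interface (syntax imagined).
ethtool -N eth0 flow-type raw-table ...
```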