On Thu, May 19, 2022 at 04:11:36PM -0700, Jakub Kicinski wrote:
> On Fri, 20 May 2022 00:02:01 +0200 Pablo Neira Ayuso wrote:
> > To improve hardware offload debuggability and scalability introduce
> > 'nf_flowtable_count_hw' and 'nf_flowtable_max_hw' sysctl entries in new
> > dedicated 'net/netfilter/ft' namespace. Add new pernet struct nf_ft_net in
> > order to store the counter and sysctl header of new sysctl table.
> >
> > Count the offloaded flows in workqueue add task handler. Verify that
> > offloaded flow total is lower than allowed maximum before calling the
> > driver callbacks. To prevent spamming the 'add' workqueue with tasks when
> > flows can't be offloaded anymore also check that count is below limit
> > before queuing offload work. This doesn't prevent all redundant workqueue
> > task since counter can be taken by concurrent work handler after the check
> > had been performed but before the offload job is executed but it still
> > greatly reduces such occurrences. Note that flows that were not offloaded
> > due to counter being larger than the cap can still be offloaded via refresh
> > function.
> >
> > Ensure that flows are accounted correctly by verifying IPS_HW_OFFLOAD_BIT
> > value before counting them. This ensures that add/refresh code path
> > increments the counter exactly once per flow when setting the bit and
> > decrements it only for accounted flows when deleting the flow with the bit
> > set.
>
> Why a sysctl and not a netlink attr per table or per device?

Per-device is not an option, because the flowtable represents a
compound of devices.

Moreover, in tc ct act the flowtable is not bound to a device, while
in netfilter/nf_tables it is.

tc ct act does not expose flowtables to userspace in any way, they
internally allocate one flowtable per zone. I assume there is no good
netlink interface for them.

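For reference, the accounting scheme from the quoted patch description
could be modelled roughly as below. This is a userspace C11 sketch, not
the patch code: the struct and function names are hypothetical, the
atomic_bool only stands in for IPS_HW_OFFLOAD_BIT, and treating a
maximum of 0 as "no limit" is an assumption.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-netns state; models the two sysctl values. */
struct ft_net {
	atomic_int count_hw;	/* models nf_flowtable_count_hw */
	int max_hw;		/* models nf_flowtable_max_hw, 0 = no limit (assumption) */
};

/* Hypothetical flow; the flag models IPS_HW_OFFLOAD_BIT. */
struct flow {
	atomic_bool hw_offload;
};

/* Cheap check before queuing offload work. It is racy on purpose:
 * a concurrent handler may take the last slot after this returns true. */
static bool ft_may_queue(struct ft_net *net)
{
	return net->max_hw == 0 || atomic_load(&net->count_hw) < net->max_hw;
}

/* Work handler side (add or refresh): count the flow exactly once. */
static bool ft_account(struct ft_net *net, struct flow *f)
{
	int old;

	if (atomic_load(&f->hw_offload))
		return true;	/* already accounted, nothing to do */

	old = atomic_fetch_add(&net->count_hw, 1);
	if (net->max_hw != 0 && old >= net->max_hw) {
		atomic_fetch_sub(&net->count_hw, 1);
		return false;	/* over the cap, a later refresh may retry */
	}

	/* Only the caller that flips the flag keeps the slot. */
	if (atomic_exchange(&f->hw_offload, true))
		atomic_fetch_sub(&net->count_hw, 1);
	return true;
}

/* Delete side: decrement only for flows that were accounted. */
static void ft_unaccount(struct ft_net *net, struct flow *f)
{
	if (atomic_exchange(&f->hw_offload, false)) {
		int old = atomic_fetch_sub(&net->count_hw, 1);

		assert(old > 0);	/* counter must never underflow */
		(void)old;
	}
}
```

The slot is reserved before the flag is set, so a flow that is rejected
at the cap never has the flag set and is never decremented on delete.
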
For netfilter/nftables, it should be possible to add per-flowtable
netlink attributes. My plan is to extend the flowtable netlink
attributes to add a flowtable maximum size.

The sysctl hw count and limit will just work as a global limit (which
is optional). My plan is that the upcoming per-flowtable limit will
just override this global limit.

I think it is a reasonable tradeoff for the different requirements of
the flowtable infrastructure users, given there are currently two
clients for this code.
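
A minimal sketch of how such an override could resolve the effective
limit. Nothing here exists yet: the function names are hypothetical,
and using 0 for "not set" on both the global and the per-flowtable
value is an assumption.

```c
#include <stdbool.h>

#define FT_NO_LIMIT 0u	/* assumption: 0 means "not set" / unlimited */

/* The per-flowtable maximum, when set, overrides the global sysctl. */
static unsigned int ft_effective_max(unsigned int global_max,
				     unsigned int table_max)
{
	return table_max != FT_NO_LIMIT ? table_max : global_max;
}

/* True if one more flow may be offloaded under the effective limit. */
static bool ft_below_limit(unsigned int count, unsigned int global_max,
			   unsigned int table_max)
{
	unsigned int max = ft_effective_max(global_max, table_max);

	return max == FT_NO_LIMIT || count < max;
}
```

With this shape, tc ct act users (which have no per-flowtable netlink
attribute) would only ever see the global value, while nftables
flowtables could opt into their own maximum.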