Hi,

This patchset fixes the indirect flow_block support for the tc CT action
offload. Please note that this batch is probably slightly large for the
net tree; however, I could not find a simple incremental fix.

= The problem

The nf_flow_table_indr_block_cb() function provides the tunnel netdevice
and the indirect flow_block driver callback. From this tunnel netdevice,
it is not possible to obtain the tc CT flow_block. Note that tc qdisc and
netfilter backtrack from the tunnel netdevice to the tc block / netfilter
chain to reach the flow_block object. This allows them to clean up the
hardware offload rules if the tunnel device is removed.

= What is the indirect flow_block infrastructure?

The indirect flow_block infrastructure allows drivers to offload
tc/netfilter rules that belong to software tunnel netdevices, e.g. vxlan.

This indirect flow_block infrastructure relates tunnel netdevices with
drivers because there is no obvious way to relate these two things from
the control plane.

= How does the indirect flow_block work before this patchset?

Front-ends register the indirect flow_block callback through
flow_indr_add_block_cb() if they support offloading tunnel netdevices.

== Setting up an indirect flow_block

1) Drivers track tunnel netdevices via NETDEV_{REGISTER,UNREGISTER}
   events. If there is a new tunnel netdevice that the driver can
   offload, then the driver invokes __flow_indr_block_cb_register() with
   the new tunnel netdevice and the driver callback. The
   __flow_indr_block_cb_register() call iterates over the list of the
   front-end callbacks.

2) The front-end callback sets up the flow_block_offload structure and
   invokes the driver callback to set up the flow_block.

3) The driver callback registers the flow_block structure and returns
   the flow_block to the front-end.

4) The front-end gets the flow_block object and is now ready to offload
   rules for this tunnel netdevice.

A simplified callgraph is represented below.
Front-end                                Driver

                                     NETDEV_REGISTER
                                           |
               __flow_indr_block_cb_register(netdev, cb_priv, driver_cb)
                                           | [1]
     .------------- frontend_indr_block_cb(cb_priv, driver_cb)
     |
setup_flow_block_offload(bo)
     | [2]
driver_cb(bo, cb_priv) --------------------.
                                           |
                                  set up flow_blocks [3]
                                           |
add rules to flow_block <------------------'
TC_SETUP_CLSFLOWER [4]

== Releasing the indirect flow_block

There are two possibilities: either the tunnel netdevice is removed or a
netdevice (port representor) is removed.

=== Tunnel netdevice is removed

The driver waits for the NETDEV_UNREGISTER event that announces the
tunnel netdevice removal. Then, it calls
__flow_indr_block_cb_unregister() to remove the flow_block and its
rules. The callgraph is very similar to the one described above.

=== Netdevice is removed (port representor)

The driver calls __flow_indr_block_cb_unregister() to remove the
existing netfilter/tc rules that belong to the tunnel netdevice.

= How does the indirect flow_block work after this patchset?

Drivers register the indirect flow_block setup callback through
flow_indr_dev_register() if they support offloading tunnel netdevices.

== Setting up an indirect flow_block

1) Front-ends check if dev->netdev_ops->ndo_setup_tc is unset. If so,
   front-ends call flow_indr_dev_setup_offload(). This call invokes the
   drivers' indirect flow_block setup callback.

2) The indirect flow_block setup callback sets up a flow_block structure
   which relates the tunnel netdevice and the driver.

3) The front-end uses the flow_block to offload the rules.

Note that the procedure to set up a (non-indirect) flow_block is very
similar.

== Releasing the indirect flow_block

=== Tunnel netdevice is removed

The front-end calls flow_indr_dev_setup_offload() to tear down the
flow_block and remove the offloaded rules. This alternate path is
exercised if dev->netdev_ops->ndo_setup_tc is unset.
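As an illustration only, the new setup and teardown path described above
can be modelled in userspace C. This is a sketch, not the kernel code:
the struct layouts, the BLOCK_BIND/BLOCK_UNBIND commands, the
has_ndo_setup_tc flag, the fixed-size callback table and every function
name and signature below are simplified, hypothetical stand-ins.
indr_dev_register() and indr_dev_setup_offload() only mimic the roles of
flow_indr_dev_register() and flow_indr_dev_setup_offload().

```c
/* Userspace model only: hypothetical stand-ins, not kernel definitions. */
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum block_cmd { BLOCK_BIND, BLOCK_UNBIND };

struct net_device {
	const char *name;
	int has_ndo_setup_tc;	/* models dev->netdev_ops->ndo_setup_tc being set */
};

struct flow_block {
	const struct net_device *dev;	/* tunnel netdevice owning the block */
	int bound;
	int nrules;
};

/* Hypothetical type for the driver's indirect flow_block setup callback. */
typedef int (*indr_setup_cb_t)(struct net_device *dev, enum block_cmd cmd,
			       struct flow_block *fb, void *cb_priv);

#define MAX_INDR_DEVS 4
static struct {
	indr_setup_cb_t cb;
	void *cb_priv;
} indr_devs[MAX_INDR_DEVS];

/* Driver side: offer tunnel offload by registering a setup callback. */
static int indr_dev_register(indr_setup_cb_t cb, void *cb_priv)
{
	for (size_t i = 0; i < MAX_INDR_DEVS; i++) {
		if (!indr_devs[i].cb) {
			indr_devs[i].cb = cb;
			indr_devs[i].cb_priv = cb_priv;
			return 0;
		}
	}
	return -1;	/* table full */
}

/* Core: invoke every registered driver callback for this tunnel netdevice. */
static int indr_dev_setup_offload(struct net_device *dev, enum block_cmd cmd,
				  struct flow_block *fb)
{
	int handled = 0;

	for (size_t i = 0; i < MAX_INDR_DEVS; i++) {
		if (indr_devs[i].cb &&
		    indr_devs[i].cb(dev, cmd, fb, indr_devs[i].cb_priv) == 0)
			handled++;
	}
	return handled ? 0 : -1;
}

/* Front-end: take the indirect path only if ndo_setup_tc is unset. */
static int frontend_block_cmd(struct net_device *dev, enum block_cmd cmd,
			      struct flow_block *fb)
{
	if (dev->has_ndo_setup_tc)
		return -1;	/* direct path, not modelled here */
	return indr_dev_setup_offload(dev, cmd, fb);
}

/* Example driver callback: cb_priv counts the blocks currently bound. */
static int demo_driver_setup_cb(struct net_device *dev, enum block_cmd cmd,
				struct flow_block *fb, void *cb_priv)
{
	int *nblocks = cb_priv;

	assert(fb != NULL && nblocks != NULL);
	switch (cmd) {
	case BLOCK_BIND:
		fb->dev = dev;
		fb->bound = 1;
		(*nblocks)++;
		return 0;
	case BLOCK_UNBIND:
		fb->bound = 0;
		fb->nrules = 0;	/* offloaded rules go away with the block */
		(*nblocks)--;
		return 0;
	}
	return -1;
}

static int demo_nblocks;

/* Bind a block for a tunnel device, offload two rules, then unbind. */
int indr_demo(void)
{
	struct net_device vxlan = { .name = "vxlan0", .has_ndo_setup_tc = 0 };
	struct flow_block fb = { 0 };

	memset(indr_devs, 0, sizeof(indr_devs));
	demo_nblocks = 0;

	if (indr_dev_register(demo_driver_setup_cb, &demo_nblocks))
		return -1;
	if (frontend_block_cmd(&vxlan, BLOCK_BIND, &fb) || !fb.bound)
		return -1;
	fb.nrules = 2;
	if (frontend_block_cmd(&vxlan, BLOCK_UNBIND, &fb))
		return -1;
	return (!fb.bound && fb.nrules == 0 && demo_nblocks == 0) ? 0 : -1;
}
```

In this model indr_demo() returns 0 when the bind, rule offload and
unbind sequence completes as expected. Setup and teardown go through the
same entry point and differ only in the command, which is the alignment
with the (non-indirect) flow_block path that the text describes.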
=== Netdevice is removed (port representor)

If a netdevice is removed, then it might need to clean up the offloaded
tc/netfilter rules that belong to the tunnel netdevice:

1) The driver invokes flow_indr_dev_unregister() when a netdevice is
   removed.

2) This call iterates over the existing indirect flow_blocks and invokes
   the cleanup callback to let the front-end remove the tc/netfilter
   rules. The cleanup callback already provides the flow_block that the
   front-end needs to clean up.

        Front-end                        Driver

                                           |
                              flow_indr_dev_unregister(...)
                                           |
                         iterate over list of indirect flow_block
                               and invoke cleanup callback
                                           |
             .-----------------------------'
             |
             .
   frontend_flow_block_cleanup(flow_block)
             .
             |
            \/
   remove rules from flow_block
       TC_SETUP_CLSFLOWER

= About this patchset

This patchset aims to address the existing TC CT problem while
simplifying the indirect flow_block infrastructure, saving roughly 300
LoC in the flow_offload core and the drivers. The operation is now
aligned with the (non-indirect) flow_block logic. The patchset is
composed of:

Patch #1 adds nf_flow_table_gc_cleanup(), which is required by the
         netfilter flowtable's new indirect flow_block approach.

Patch #2 adds the flow_block_indr object, which is actually part of the
         flow_block object. This stores the indirect flow_block metadata
         such as the tunnel netdevice owner and the cleanup callback (in
         case the tunnel netdevice goes away).

         This patch adds flow_indr_dev_{un}register() to allow drivers
         to offer netdevice tunnel hardware offload to the front-ends.
         Then, front-ends call flow_indr_dev_setup_offload() to invoke
         the drivers to set up the (indirect) flow_block.

Patch #3 adds the tcf_block_offload_init() helper function. This is a
         preparation patch to adapt the tc front-end to use this new
         indirect flow_block infrastructure.

Patch #4 updates the tc and netfilter front-ends to use the new indirect
         flow_block infrastructure.

Patch #5 updates the mlx5 driver to use the new indirect flow_block
         infrastructure.

Patch #6 updates the nfp driver to use the new indirect flow_block
         infrastructure.

Patch #7 updates the bnxt driver to use the new indirect flow_block
         infrastructure.

Patch #8 removes the indirect flow_block infrastructure version 1, now
         that front-ends and drivers have been converted to version 2
         (coming in this patchset).

Please, apply.

Pablo Neira Ayuso (8):
  netfilter: nf_flowtable: expose nf_flow_table_gc_cleanup()
  net: flow_offload: consolidate indirect flow_block infrastructure
  net: cls_api: add tcf_block_offload_init()
  net: use flow_indr_dev_setup_offload()
  mlx5: update indirect block support
  nfp: update indirect block support
  bnxt_tc: update indirect block support
  net: remove indirect block netdev event registration

 drivers/net/ethernet/broadcom/bnxt/bnxt.h    |   1 -
 drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c |  51 +--
 .../net/ethernet/mellanox/mlx5/core/en_rep.c |  83 +----
 .../net/ethernet/mellanox/mlx5/core/en_rep.h |   5 -
 .../net/ethernet/netronome/nfp/flower/main.c |  11 +-
 .../net/ethernet/netronome/nfp/flower/main.h |   7 +-
 .../ethernet/netronome/nfp/flower/offload.c  |  35 +-
 include/net/flow_offload.h                   |  28 +-
 include/net/netfilter/nf_flow_table.h        |   2 +
 net/core/flow_offload.c                      | 301 +++++++-----------
 net/netfilter/nf_flow_table_core.c           |   6 +-
 net/netfilter/nf_flow_table_offload.c        |  85 +----
 net/netfilter/nf_tables_offload.c            |  69 ++--
 net/sched/cls_api.c                          | 157 +++------
 14 files changed, 251 insertions(+), 590 deletions(-)

--
2.20.1