Hi,

Consider a ruleset such as:

    table inet filter {
        chain forward {
            type filter hook forward priority filter; policy accept;
            ip dscp set cs3
            ct state established,related accept
        }
    }

As expected, all of the packets from 10.0.2.99 to 10.0.1.99 have their
IPv4 TOS field changed to 0x60:

    ...
    13:36:42.474591 fe:dc:b3:e2:dc:3b > 5a:45:4d:2a:25:65, ethertype IPv4 (0x0800), length 1090: (tos 0x60, ttl 62, id 39855, offset 0, flags [none], proto TCP (6), length 1076)
        10.0.2.99.12345 > 10.0.1.99.44084: Flags [P.], cksum 0x1bec (incorrect -> 0x44c3), seq 1:1025, ack 1025, win 1987, options [nop,nop,TS val 2854899766 ecr 3249774499], length 1024
    ...

Now let's try to add flow offload:

    table inet filter {
        flowtable f1 {
            hook ingress priority filter
            devices = { veth0, veth1 }
        }

        chain forward {
            type filter hook forward priority filter; policy accept;
            ip dscp set cs3
            ip protocol { tcp, udp, gre } flow add
            ct state established,related accept
        }
    }

Although some of the packets still have the correct TOS, some do not:

    ...
    13:41:17.138782 5e:d5:1f:a3:ba:d1 > d2:d2:73:e6:5b:92, ethertype IPv4 (0x0800), length 1090: (tos 0x0, ttl 62, id 20142, offset 0, flags [none], proto TCP (6), length 1076)
        10.0.2.99.12345 > 10.0.1.99.34230: Flags [P.], cksum 0x1bec (incorrect -> 0xc090), seq 1:1025, ack 1, win 2009, options [nop,nop,TS val 2855174430 ecr 3250049157], length 1024
    ...

The root cause of the bug seems to be that nft_payload_set_eval() (which
sets the DSCP bits of the TOS field) isn't called on the offload fast
path in nf_flow_offload_ip_hook().

The fix in this patch series is to record payload modifications in a new
conntrack extension and then apply those modifications on the fast path.

To signal the intent to record payload changes, we add an offload flag to
the nft userspace tool (separate patches follow). For example, the dscp
set line becomes:

    ...
    ip dscp set cs3 offload
    ...

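For readers who want to see the symptom in isolation, here is a toy
userspace model in Python. It is only a sketch and not kernel code: the
names Conn, forward_chain, flowtable_hook and send are hypothetical and
do not appear in the patches. It assumes that the first packet of a flow
traverses the forward chain and that every later packet takes the
flowtable fast path. It also shows why cs3 (DSCP 24) appears as tos 0x60
in tcpdump: DSCP occupies the upper six bits of the TOS byte.

```python
from dataclasses import dataclass, field

CS3 = 0b011000  # DSCP class selector 3, decimal 24


def set_dscp(tos: int, dscp: int) -> int:
    """Replace the upper six (DSCP) bits of the TOS byte, keep the two ECN bits."""
    return ((dscp & 0x3F) << 2) | (tos & 0x03)


@dataclass
class Conn:
    """Toy conntrack entry; `saved` stands in for the proposed extension."""
    offloaded: bool = False
    saved: list = field(default_factory=list)  # recorded DSCP values


def forward_chain(conn: Conn, tos: int, record: bool) -> int:
    """Slow path: 'ip dscp set cs3' is evaluated, then the flow is offloaded."""
    if record and not conn.saved:
        conn.saved.append(CS3)  # hypothetical equivalent of the 'offload' flag
    conn.offloaded = True       # models 'flow add'
    return set_dscp(tos, CS3)


def flowtable_hook(conn: Conn, tos: int) -> int:
    """Fast path: the chain is bypassed; only recorded mangles are replayed."""
    for dscp in conn.saved:
        tos = set_dscp(tos, dscp)
    return tos


def send(conn: Conn, tos: int, record: bool) -> int:
    """Route one packet through the fast path if the flow is offloaded."""
    if conn.offloaded:
        return flowtable_hook(conn, tos)
    return forward_chain(conn, tos, record)


if __name__ == "__main__":
    for record in (False, True):
        conn = Conn()
        print(record, [hex(send(conn, 0x00, record)) for _ in range(3)])
```

Without recording, only the first packet of the flow carries 0x60 and the
rest keep tos 0x0, which matches the mixed captures above. With recording
enabled, every packet carries 0x60.
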
Some high-level description of the patches:

* patches 1-4 fix small but annoying infelicities in the
  nft_flowtable.sh test script
* patches 5-7 export payload modification functionality from
  nft_payload.c
* patches 8-10 add a new NFT_PAYLOAD_CAN_OFFLOAD flag, set by userspace
* patches 11-13 are technical changes to add the new conntrack extension
* patches 14-16 add payload context to the conntrack extension and apply
  it on the fast path
* patches 17-18 save the payload context if the NFT_PAYLOAD_CAN_OFFLOAD
  flag is set
* patch 19 adds a dscp modification offload test to the nft_payload.sh
  test script

Thanks,
Boris.

Boris Sukholitko (19):
  selftest: netfilter: use /proc for pid checking
  selftest: netfilter: no need for ps -x option
  selftest: netfilter: wait for specific nc pids
  selftest: netfilter: monitor result file sizes
  netfilter: nft_payload: refactor mangle operation
  netfilter: nft_payload: publish nft_payload_set
  netfilter: nft_payload: export mangle
  netfilter: nft_payload: use flag for checksum need
  netfilter: nft_payload: add offload flag define
  netfilter: nft_payload: allow offload in the netlink
  netfilter: conntrack: nft extension Kconfig
  netfilter: nft: empty nft conntrack extension
  netfilter: conntrack: register nft extension
  netfilter: nft: add payload context into extension
  netfilter: nft: add payload application
  netfilter: nftables: fast path payload mangle
  netfilter: nftables: payload save mechanism
  netfilter: nft_payload: save payload if needed
  selftests: netfilter: dscp offload test

 include/net/netfilter/nf_conntrack_extend.h   |  3 +
 include/net/netfilter/nf_tables.h             | 68 +++++++++++++++++++
 include/uapi/linux/netfilter/nf_tables.h      |  1 +
 net/netfilter/Kconfig                         | 10 +++
 net/netfilter/Makefile                        |  2 +
 net/netfilter/nf_conntrack_core.c             |  2 +
 net/netfilter/nf_conntrack_extend.c           |  9 ++-
 net/netfilter/nf_conntrack_netlink.c          |  2 +
 net/netfilter/nf_flow_table_ip.c              |  3 +
 net/netfilter/nft_conntrack_ext.c             | 56 +++++++++++++++
 net/netfilter/nft_payload.c                   | 46 +++++++------
 .../selftests/netfilter/nft_flowtable.sh      | 61 +++++++++++++++--
 12 files changed, 237 insertions(+), 26 deletions(-)
 create mode 100644 net/netfilter/nft_conntrack_ext.c

-- 
2.32.0