When the interface matched in the "raw prerouting" chain is enslaved
to a VRF, matching behaves differently on kernel 5.4 than on kernels
5.10 and later (I have no systems to check the kernels in between).

On 5.4, the veth interface is matched and the zone is set accordingly.
Then the vrf interface is matched as well, and according to the trace
the rule is executed, but the zone, once set, does not change.

On 5.10 and later, the rule that should match the veth interface
_does not appear in the trace_, even though the trace shows the veth
as the `iif` at that moment. Then the rule that matches the vrf
interface is executed, and the corresponding zone is set.

The reproducer script creates a veth pair with one end enslaved to a
vrf, and pings the address of the enslaved end through the unenslaved
end. In the prerouting chain, there are rules that set a different
conntrack zone depending on which iif matched: veth (zone 1) or vrf
(zone 2). As a result, the entries are created in different zones when
the script runs on earlier and on later kernels.

Here are the results (observe the different zones), and the script is
below.

========
5.4.86-pserver
conntrack v1.4.5 (conntrack-tools): connection tracking table has been emptied.
PING 172.30.30.2 (172.30.30.2) from 172.30.30.1 vein: 56(84) bytes of data.
64 bytes from 172.30.30.2: icmp_seq=1 ttl=64 time=0.128 ms

--- 172.30.30.2 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.128/0.128/0.128/0.000 ms
icmp 1 30 src=172.30.30.1 dst=172.30.30.2 type=8 code=0 id=13818 [UNREPLIED] src=172.30.30.2 dst=172.30.30.1 type=0 code=0 id=13818 mark=0 zone=1 use=1
conntrack v1.4.5 (conntrack-tools): 1 flow entries have been shown.
========
5.13.0-16-generic
conntrack v1.4.6 (conntrack-tools): connection tracking table has been emptied.
PING 172.30.30.2 (172.30.30.2) from 172.30.30.1 vein: 56(84) bytes of data.
64 bytes from 172.30.30.2: icmp_seq=1 ttl=64 time=0.117 ms

--- 172.30.30.2 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.117/0.117/0.117/0.000 ms
icmp 1 30 src=172.30.30.1 dst=172.30.30.2 type=8 code=0 id=104 [UNREPLIED] src=172.30.30.2 dst=172.30.30.1 type=0 code=0 id=104 mark=0 zone=2 use=1
conntrack v1.4.6 (conntrack-tools): 1 flow entries have been shown.
========
#!/bin/sh

IPIN=172.30.30.1
IPOUT=172.30.30.2
PFXL=30

ip li sh vein >/dev/null 2>&1 && ip li del vein
ip li sh tvrf >/dev/null 2>&1 && ip li del tvrf
nft list table testct >/dev/null 2>&1 && nft delete table testct

ip li add vein type veth peer veout
ip li add tvrf type vrf table 9876
ip li set veout master tvrf
ip li set vein up
ip li set veout up
ip li set tvrf up
sysctl -w net.ipv4.conf.veout.accept_local=1
ip addr add $IPIN/$PFXL dev vein
ip addr add $IPOUT/$PFXL dev veout

nft -f - <<__END__
table testct {
	chain rawpre {
		type filter hook prerouting priority raw;
		# iif { veout, tvrf } meta nftrace set 1
		iif veout ct zone set 1 return
		iif tvrf ct zone set 2 return
		notrack
	}
	chain rawout {
		type filter hook output priority raw;
		notrack
	}
}
__END__

uname -r
conntrack -F
ping -W 1 -c 1 -I vein $IPOUT
conntrack -L
========

Is this a known situation? Which behavior is "correct"?

Thank you,

Eugene