Re: In raw prerouting, `iif` matches different interfaces in different kernels when enslaved in a vrf


 



Hello Florian,



On 02/10/2021 20:50, Florian Westphal wrote:



> Eugene Crosser <crosser@xxxxxxxxxxx> wrote:
>> Is this a known situation? Which behavior is "correct"?
>
> No idea, your reproducer gives this on my laptop:
>
>  unshare -n bash repro.sh
> net.ipv4.conf.veout.accept_local = 1
> 5.14.9-200.fc34.x86_64
> conntrack v1.4.5 (conntrack-tools): connection tracking table has been emptied.
> PING 172.30.30.2 (172.30.30.2) from 172.30.30.1 vein: 56(84) bytes of data.
>
> --- 172.30.30.2 ping statistics ---
> 1 packets transmitted, 0 received, 100% packet loss, time 0ms
>
> conntrack v1.4.5 (conntrack-tools): 0 flow entries have been shown.



It would seem that an existing filter on your machine drops the packets and so prevents conntrack entries from being created. I can reproduce the behaviour on freshly installed Debian and Ubuntu VMs without any modifications, both with and without `unshare`.



> A bisection is needed to figure out what introduced a change.
>
> However, if this is already changed for a few releases then we can't
> revert it again.



I don't think this behaviour change is benign, though. If several interfaces are enslaved in one VRF (which is a normal configuration), you can no longer create rules that depend on the specific interface on which the packet arrived.
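To illustrate the kind of configuration that is affected, here is a minimal sketch that generates a ruleset giving every enslaved port its own conntrack zone. The function name `gen_zone_rules`, the table name `testzones` and the interface names `veout1`/`veout2` are made up for the example. It uses `iif` as in the reproducer, so the interfaces would have to exist when the ruleset is loaded.

```shell
#!/bin/sh
# Hypothetical helper: print an nft ruleset that puts packets from each
# listed interface into its own conntrack zone (zones are numbered from 1).
gen_zone_rules() {
	zone=1
	printf 'table testzones {\n'
	printf '\tchain rawpre {\n'
	printf '\t\ttype filter hook prerouting priority raw;\n'
	for ifname in "$@"; do
		# One rule per interface; this only works if `iif` sees the port,
		# not the VRF device it is enslaved in.
		printf '\t\tiif %s ct zone set %d return\n' "$ifname" "$zone"
		zone=$((zone + 1))
	done
	printf '\t}\n'
	printf '}\n'
}

# Example: two hypothetical ports enslaved in the same VRF.
gen_zone_rules veout1 veout2
```

The output can be piped into `nft -f -`. If `iif` matches the VRF device rather than the port, none of the generated rules can tell the ports apart.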



So far I have been able to show that it depends on the kernel version and nothing else. I installed Debian bullseye on a fresh VM and upgraded it to Debian sid. The VM now has two kernels: 5.10.0-8 and 5.14.0-2 (Debian builds). When booted with the older kernel, my reproducer shows the "correct" behaviour (the rule matches the original veth); when booted with the newer kernel, the behaviour is altered (the rule matches the VRF instead).
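For comparing the two kernels, the zone recorded in the conntrack entry shows which rule matched: in the reproducer, zone 1 is set by the `iif veout` rule and zone 2 by the `iif tvrf` rule. A rough sketch of a helper that reads `conntrack -L` output and reports this follows. The function name `classify_zone` is made up, and it assumes the entry carries a `zone=N` field, as conntrack-tools prints for non-default zones.

```shell
#!/bin/sh
# Hypothetical helper: read `conntrack -L` output on stdin and print the
# interface whose rule set the zone of the first matching entry.
classify_zone() {
	while read -r line; do
		case "$line" in
			*" zone=1 "*|*" zone=1") echo "veout"; return 0 ;;
			*" zone=2 "*|*" zone=2") echo "tvrf"; return 0 ;;
		esac
	done
	# No entry with zone 1 or 2 was found (e.g. the packet was dropped).
	echo "none"
	return 1
}

# Usage: conntrack -L 2>/dev/null | classify_zone
```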



I also updated the reproducer to record nftrace output, and it looks "interesting". I am including the new reproducer below, and I can send the nftrace files if needed.
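A quick way to compare the trace files from two kernels is to count the input interfaces and the rules that show up in them. The sketch below assumes the usual `nft monitor trace` layout, where packet lines contain `packet: iif "NAME"` and rule lines contain the word `rule` after the chain name. The function name `trace_summary` is made up.

```shell
#!/bin/sh
# Hypothetical helper: summarise an `nft monitor trace` log file.
trace_summary() {
	file=$1
	echo "input interfaces seen:"
	# Count how often each input interface appears on packet lines.
	grep -o 'packet: iif "[^"]*"' "$file" | sort | uniq -c
	echo "rules hit:"
	# Strip the per-packet trace id so identical rules can be counted.
	grep ' rule ' "$file" | sed 's/^trace id [0-9a-f]* //' | sort | uniq -c
}

# Usage: trace_summary "nftrace.$(uname -r).txt"
```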



Now I am trying to bisect the upstream kernel.
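Since each step needs a reboot into the kernel under test, the bisection has to be driven by hand. One possible way to keep the verdicts consistent is a small helper that maps the interface matched by `iif` to the next git command. This is only a sketch: the function name `bisect_next` is made up, and it assumes the bisection was started with something like `git bisect start`, `git bisect good v5.10` and `git bisect bad v5.14`, on the assumption that these upstream tags bracket the two Debian builds.

```shell
#!/bin/sh
# Hypothetical helper for a manual kernel bisection: map the interface that
# `iif` matched on the kernel under test to the git command to run next.
bisect_next() {
	case "$1" in
		veout) echo "git bisect good" ;;  # old behaviour: rule matched the veth
		tvrf)  echo "git bisect bad" ;;   # new behaviour: rule matched the VRF
		*)     echo "git bisect skip" ;;  # no verdict, e.g. the kernel did not boot
	esac
}

# Usage: bisect_next veout
```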



Thanks.



==========



#!/bin/sh

IPIN=172.30.30.1
IPOUT=172.30.30.2
PFXL=30

# Clean up leftovers from a previous run.
ip li sh vein >/dev/null 2>&1 && ip li del vein
ip li sh tvrf >/dev/null 2>&1 && ip li del tvrf
nft list table testct >/dev/null 2>&1 && nft delete table testct

# Create a veth pair and enslave one end (veout) in the VRF tvrf.
ip li add vein type veth peer veout
ip li add tvrf type vrf table 9876
ip li set veout master tvrf
ip li set vein up
ip li set veout up
ip li set tvrf up
/sbin/sysctl -w net.ipv4.conf.veout.accept_local=1
ip addr add $IPIN/$PFXL dev vein
ip addr add $IPOUT/$PFXL dev veout

# Zone 1 means `iif` matched the veth, zone 2 means it matched the VRF.
nft -f - <<__END__
table testct {
	chain rawpre {
		type filter hook prerouting priority raw;
		iif { veout, tvrf } meta nftrace set 1
		iif veout ct zone set 1 return
		iif tvrf ct zone set 2 return
		notrack
	}
	chain rawout {
		type filter hook output priority raw;
		notrack
	}
}
__END__

# Send one ping with tracing enabled, then dump the conntrack table.
uname -rv
conntrack -F
stdbuf -o0 nft monitor trace >nftrace.$(uname -r).txt &
monpid=$!
ping -W 1 -c 1 -I vein $IPOUT
conntrack -L
sleep 1
kill -15 $monpid
wait
========


