> On Wed, 28 Sep 2022 16:02:43 +0200 Maximilien Cuony wrote:
>
> However when the issue is present, the SYNACK does arrive on eth2, but
> is never "unNATed" back to eth1:
>
> 10:25:07.644433 eth0 Out IP 192.168.5.1.48684 > 99.99.99.99.80: Flags [S], seq 3207393154
> 10:25:07.644782 eth1 In IP 192.168.5.1.48684 > 99.99.99.99.80: Flags [S], seq 3207393154
> 10:25:07.644793 eth2 Out IP 192.168.1.1.48684 > 99.99.99.99.80: Flags [S], seq 3207393154
> 10:25:07.668551 eth2 In IP 54.36.61.42.80 > 192.168.1.1.48684: Flags [S.], seq 823335485, ack 3207393155
>
> The issue is only with TCP connections. UDP or ICMP works fine.
>
> Turning off net.ipv4.tcp_l3mdev_accept back to 0 also fixes the issue, but
> we need this flag since we use some sockets that do not understand VRFs.
>
> We did have a look at the diff and the code of inet_bound_dev_eq, but we
> didn't understand much the real problem - but it does seem now that
> bound_dev_if is now checked not to be False before the bound_dev_if ==
> dif || bound_dev_if == sdif comparison, something that was not the case
> before (especially since it's dependent on l3mdev_accept).
>
> Maybe our setup is wrong and we should not be able to route packets like
> that?
>
> Thanks a lot and have a nice day!
>
> Maximilien Cuony

Hi Maximilien,

Apologies that you have now hit this issue. Further to David's reply
with the link to the rationale behind the change: the bisected commit
you found restores backwards compatibility with the 4.19 kernel by
allowing a match on an unbound socket when in a VRF if
tcp_l3mdev_accept=1. The absence of this was causing issues for others.
Isolation between the default and other VRFs, as implemented by the team
I worked for back in 2018 and introduced in 5.x kernels, remains
guaranteed if tcp_l3mdev_accept=0. There is no appetite so far to
introduce yet another kernel parameter to control this specific
behavior, see e.g.
https://lore.kernel.org/netdev/f174108c-67c5-3bb6-d558-7e02de701ee2@xxxxxxxxx/

Is there any possibility that you could use tcp_l3mdev_accept=0 by
running any services needed in the VRF with 'ip vrf exec <vrf> <cmd>'?

Is the problem specific to using NAT for eth2 in the VRF, i.e. have you
tried on another interface in that VRF, or on eth2 without NAT config?
While a match on an unbound socket in the VRF is now possible again and
is somehow causing the issue, I would have thought that a bound socket
should still be chosen because it has a higher score, cf.
compute_score().

No doubt you are doing this, but can I also check that your VRF config
is correct according to
https://www.kernel.org/doc/Documentation/networking/vrf.txt , so
reducing the local lookup preference, etc., e.g.

    ip route add table 1200 unreachable default metric 4278198272
    ip -6 route add table 1200 unreachable default metric 4278198272
    ip rule add pref 32765 from all lookup local
    ip rule del pref 0 from all lookup local

(and check the output of 'ip rule' & 'ip route ls vrf firewall', no need
to reply with this)

Thanks
Mike
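
For anyone following the thread, the device-match rule under discussion
can be modelled roughly as below. This is only a simplified Python
sketch of the behaviour as described in this thread, not the kernel
source. The function name bound_dev_matches is hypothetical, the
parameter names are borrowed from the discussion of inet_bound_dev_eq
above, and the model assumes that sdif is non-zero only for packets
received through a VRF.

```python
def bound_dev_matches(l3mdev_accept: bool, bound_dev_if: int,
                      dif: int, sdif: int) -> bool:
    """Simplified model (an assumption, not kernel code) of whether a
    socket's device binding is compatible with an incoming packet.

    bound_dev_if: interface index the socket is bound to, 0 if unbound.
    dif, sdif:    the two interface indexes associated with the packet;
                  sdif is assumed to be 0 when no VRF is involved.
    """
    if bound_dev_if == 0:
        # Unbound socket: always eligible outside a VRF; inside a VRF
        # only when tcp_l3mdev_accept=1.
        return sdif == 0 or l3mdev_accept
    # Bound socket: must match one of the packet's device indexes.
    return bound_dev_if == dif or bound_dev_if == sdif


if __name__ == "__main__":
    # Hypothetical interface indexes, chosen purely for illustration.
    for accept in (False, True):
        print("tcp_l3mdev_accept =", int(accept),
              "-> unbound socket matches VRF packet:",
              bound_dev_matches(accept, 0, dif=10, sdif=3))
```

In this model an unbound socket becomes a candidate for VRF traffic as
soon as tcp_l3mdev_accept=1, which is why a bound socket is then
expected to win only through its higher score rather than by exclusion.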