Lahav Schlesinger <lschlesinger@xxxxxxxxxxxxx> wrote:
> The call to nf_reset_ct() I added was to match the existing call in the
> egress flow, which I didn't want to change in order to not break
> existing behaviour (which I unintentionally still did :-)).
>
> Seems like any combination of calling nf_reset_ct() will lead to
> something breaking. So continuing on what Florian suggested, another
> possibility is to make the calls to nf_reset_ct() in both ingress and egress
> flow configurable (procfs or new flags to RTM_NEWLINK).
>
> One benefit of this is that disabling nf_reset_ct() on the egress flow will
> mean no port SNAT will take place when SNAT rule is installed on a VRF
> (as I described in my original commit), which can break applications
> that depend on using a specific source port.

Looking at the original change, eb63ecc1706b3e094d0f57438b6c2067cfc299f2
"net: vrf: Drop conntrack data after pass through VRF device on Tx",
I wonder if that's not the real cause of the problem.

=========================
Locally originated traffic in a VRF fails in the presence of a
POSTROUTING rule. For example,

    $ iptables -t nat -A POSTROUTING -s 11.1.1.0/24 -j MASQUERADE
    $ ping -I red -c1 11.1.1.3
    ping: Warning: source address might be selected on device other than red.
    PING 11.1.1.3 (11.1.1.3) from 11.1.1.2 red: 56(84) bytes of data.
    ping: sendmsg: Operation not permitted
=========================

I think we first need selftest scripts that recreate the three
scenarios (the one reported by Eugene, the one outlined above, and the
double-PAT one Lahav fixed) before any code changes are tested.

It's tempting to just change the nf_reset_ct() done on egress to be
conditional on the ct->status snat bit and drop support for double-snat.

Given Lahav's patch, double-snat probably never worked to begin with?
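
As a starting point, a selftest for the second scenario (the one from the eb63ecc1706b commit message) might look roughly like the sketch below. This is not an existing kernel selftest; the namespace names (ns0, ns1), the veth names and the VRF table id 10 are made up for illustration. Only the VRF name "red", the 11.1.1.0/24 addresses and the MASQUERADE rule are taken from the commit message. It assumes root, iproute2, iptables and a kernel with VRF support, and it only does anything when called with the argument "run".

```shell
#!/bin/sh
# Sketch: locally originated ping inside a VRF with a POSTROUTING
# MASQUERADE rule installed. Hypothetical names: ns0, ns1, veth0, veth1,
# VRF table 10. Addresses and the VRF name "red" follow the commit message.

NS0=ns0		# namespace that owns the VRF and sends the ping
NS1=ns1		# peer namespace that answers the ping

cleanup() {
	# Deleting the namespaces also removes the veth pair and the VRF.
	ip netns del "$NS0" 2>/dev/null
	ip netns del "$NS1" 2>/dev/null
	return 0
}

setup() {
	ip netns add "$NS0" || return 1
	ip netns add "$NS1" || return 1
	ip link add veth0 netns "$NS0" type veth peer name veth1 netns "$NS1" || return 1

	# VRF "red" in ns0, with veth0 enslaved to it.
	ip -n "$NS0" link add red type vrf table 10 || return 1
	ip -n "$NS0" link set red up || return 1
	ip -n "$NS0" link set veth0 master red || return 1
	ip -n "$NS0" addr add 11.1.1.2/24 dev veth0 || return 1
	ip -n "$NS0" link set veth0 up || return 1

	ip -n "$NS1" addr add 11.1.1.3/24 dev veth1 || return 1
	ip -n "$NS1" link set veth1 up || return 1
}

run_test() {
	# Same rule and ping as in the commit message, run inside ns0.
	ip netns exec "$NS0" iptables -t nat -A POSTROUTING \
		-s 11.1.1.0/24 -j MASQUERADE || return 1
	ip netns exec "$NS0" ping -I red -c1 -W2 11.1.1.3 >/dev/null
}

main() {
	if [ "$(id -u)" -ne 0 ]; then
		echo "SKIP: must run as root"
		return 4	# kselftest skip code
	fi
	trap cleanup EXIT
	cleanup
	if ! setup; then
		echo "SKIP: could not set up namespaces or VRF"
		return 4
	fi
	if run_test; then
		echo "PASS: ping through VRF with MASQUERADE rule"
		return 0
	fi
	echo "FAIL: ping through VRF with MASQUERADE rule"
	return 1
}

# Only touch the system when explicitly asked to.
if [ "${1:-}" = "run" ]; then
	main
	exit $?
fi
```

The other two scenarios (Eugene's report and the double-PAT case) would need their own setup and are not covered here.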