Patrick McHardy wrote:
> bugzilla-daemon@xxxxxxxxxxxxxxxxxxx wrote:
>> https://bugzilla.kernel.org/show_bug.cgi?id=16317
>> [581172.269340] ------------[ cut here ]------------
>> [581172.280485] kernel BUG at net/ipv4/netfilter/nf_nat_core.c:300!
>>
>
> NAT is attempting to set up mappings a second time for an existing
> conntrack.
>
> So the failover node is purely passive and is not synchronizing connections
> back to the one which is crashing? That would rule out a race condition
> between creating a new conntrack using ctnetlink and the lookup done during
> packet processing.

Syncing is done in both directions simultaneously, so the described race is
not ruled out. Coincidentally or not, both crashes so far seem to have
occurred on the 6th second of a minute, which is around when conntrackd -c
usually finishes.

I'm a bit confused about how the race might happen. Would it mean that the
src/dst ip:port gets reused, or that the client transmits a packet after the
conntrack has expired on the active box, while the failover box is
synchronizing it back to the active one?

> I can't spot the problem right now, but it would be interesting whether
> this still happens without running the (synchronizing) conntrack daemon.

I can't keep this running in production, so I will have to try to reproduce
it on a test setup. As I'm not sure which scenario to test, I'll just create
lots of SNAT/DNAT connections while syncing them with conntrackd (and
conntrackd -c) running for a while, hoping to recreate whatever triggers it?

Siim