Greetings.

I've been trying for some time now to get a net-to-net VPN to work, but I'm running into some kind of packet loss or misdirection on my gateways. Here is the setup:

  eth0       eth0     eth1         eth0      eth1       eth0
10.1.1.2---10.1.1.1--A.B.C.D======E.F.G.H--10.2.1.1---10.2.1.2
client1        gateway1               gateway2        client2

So let's say I try to ping client2 from client1. With tcpdump I can watch the packets go through the external interface of gateway1. Each ping seems to produce six packets:

06:21:57.322443 IP (tos 0x0, ttl 63, id 29281, offset 0, flags [DF], proto 51, length: 180)
    A.B.C.D > E.F.G.H: AH(spi=0x059f0f7a,sumlen=16,seq=0x3a):
    IP (tos 0x0, ttl 64, id 15411, offset 0, flags [DF], proto 50, length: 136)
    A.B.C.D > E.F.G.H: ESP(spi=0x0a3f13cf,seq=0x3a)
06:21:57.322443 IP (tos 0x0, ttl 64, id 15411, offset 0, flags [DF], proto 50, length: 136)
    A.B.C.D > E.F.G.H: ESP(spi=0x0a3f13cf,seq=0x3a)
06:21:57.322443 IP (tos 0x0, ttl 63, id 57, offset 0, flags [DF], proto 1, length: 84)
    10.95.244.2 > 10.95.211.2: icmp 64: echo request seq 57
06:21:57.342832 IP (tos 0x0, ttl 64, id 227, offset 0, flags [DF], proto 17, length: 544)
    E.F.G.H.isakmp > A.B.C.D.isakmp: isakmp 1.0 msgid : phase 2/others ? oakley-quick[E]: [encrypted hash]
06:21:57.385329 IP (tos 0x0, ttl 63, id 114, offset 0, flags [DF], proto 17, length: 360)
    A.B.C.D.isakmp > E.F.G.H.isakmp: isakmp 1.0 msgid : phase 2/others ? oakley-quick[E]: [encrypted hash]
06:21:57.386213 IP (tos 0x0, ttl 64, id 228, offset 0, flags [DF], proto 17, length: 88)
    E.F.G.H.isakmp > A.B.C.D.isakmp: isakmp 1.0 msgid : phase 2/others ? oakley-quick[E]: [encrypted hash]

More often it's only four packets, though, where the packets of length 136 and 84 are not there.

On gateway2, each ping seems to produce only three packets, so it's missing the ones with a length of 180, 136, and 84:

13:03:31.355526 IP (tos 0x0, ttl 63, id 1949, offset 0, flags [DF], proto 17, length: 544)
    E.F.G.H.isakmp > A.B.C.D.isakmp: isakmp 1.0 msgid : phase 2/others ? oakley-quick[E]: [encrypted hash]
13:03:31.399320 IP (tos 0x0, ttl 64, id 1166, offset 0, flags [DF], proto 17, length: 360)
    A.B.C.D.isakmp > E.F.G.H.isakmp: isakmp 1.0 msgid : phase 2/others ? oakley-quick[E]: [encrypted hash]
13:03:31.401131 IP (tos 0x0, ttl 63, id 1950, offset 0, flags [DF], proto 17, length: 88)
    E.F.G.H.isakmp > A.B.C.D.isakmp: isakmp 1.0 msgid : phase 2/others ? oakley-quick[E]: [encrypted hash]

On client2, all appears to be well. It receives a ping and responds to it. But client1 never receives that response, and sniffing eth0 (the internal interface) on gateway1 shows the outgoing ping but no ping response being sent back to client1.

Now I reverse it, trying to ping client1 from client2. gateway2 behaves the same way, still missing the 180, 136, and 84 packets:

06:13:11.805316 IP (tos 0x0, ttl 64, id 97, offset 0, flags [DF], proto 17, length: 544)
    E.F.G.H.isakmp > A.B.C.D.isakmp: isakmp 1.0 msgid : phase 2/others I oakley-quick[E]: [encrypted hash]
06:13:11.847381 IP (tos 0x0, ttl 63, id 49, offset 0, flags [DF], proto 17, length: 360)
    A.B.C.D.isakmp > E.F.G.H.isakmp: isakmp 1.0 msgid : phase 2/others R oakley-quick[E]: [encrypted hash]
06:13:11.848253 IP (tos 0x0, ttl 64, id 98, offset 0, flags [DF], proto 17, length: 88)
    E.F.G.H.isakmp > A.B.C.D.isakmp: isakmp 1.0 msgid : phase 2/others I oakley-quick[E]: [encrypted hash]

...and now it has gateway1 behaving the same way too:

14:08:27.937838 IP (tos 0x0, ttl 63, id 81, offset 0, flags [DF], proto 17, length: 544)
    E.F.G.H.isakmp > A.B.C.D.isakmp: isakmp 1.0 msgid : phase 2/others I oakley-quick[E]: [encrypted hash]
14:08:27.980288 IP (tos 0x0, ttl 64, id 41, offset 0, flags [DF], proto 17, length: 360)
    A.B.C.D.isakmp > E.F.G.H.isakmp: isakmp 1.0 msgid : phase 2/others R oakley-quick[E]: [encrypted hash]
14:08:27.981904 IP (tos 0x0, ttl 63, id 82, offset 0, flags [DF], proto 17, length: 88)
    E.F.G.H.isakmp > A.B.C.D.isakmp: isakmp 1.0 msgid : phase 2/others I oakley-quick[E]: [encrypted hash]

No ping is received by client1 in this case, so the tunnel is even less fully formed than it is in the original scenario.

One interesting thing I noticed is the kernel routing table. gateway1 uses eth1 as the external interface, and gateway2 uses eth0, but the /etc/sysconfig/network-scripts/ifup script seems to choose the last one it brings up for the APIPA (169.254.0.0/16) destination, regardless of whether it's the internal or external interface.

gateway1:

Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
A.B.C.0         *               255.255.255.0   U     0      0        0 eth1
10.1.1.0        *               255.255.255.0   U     0      0        0 eth0
169.254.0.0     *               255.255.0.0     U     0      0        0 eth1
default         I.J.K.L         0.0.0.0         UG    0      0        0 eth1

gateway2:

Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
E.F.G.0         *               255.255.255.0   U     0      0        0 eth0
10.2.1.0        *               255.255.255.0   U     0      0        0 eth1
169.254.0.0     *               255.255.0.0     U     0      0        0 eth1
default         M.N.O.P         0.0.0.0         UG    0      0        0 eth0

Also, no lo interface appears in the routing table, which seems odd to me since it shows up when I run ifconfig. How either of these would affect my VPN I'm not sure, though I do see messages like these in syslog that make me wonder:

racoon: INFO: 127.0.0.1[500] used as isakmp port (fd=10)

I tried setting the APIPA destination so that it was the external interface on both machines with '/sbin/ip route replace 169.254.0.0/16 dev eth0' on E.F.G.H. If I repeat the client1 to client2 ping after this, the only change is that gateway2 is no longer missing the 180 length packet.

If I change the APIPA destination to the internal interface on both machines and repeat the client1 to client2 ping, gateway2 shows all six of the original packets, but poor gateway1 now only shows packets with a 180 length. Doing a client2 to client1 ping with the internal interfaces as the APIPA destination shows both gateway1 and gateway2 missing the 180, 136, and 84 packets, and client1 never receives the ping.

Other notes:

1. Both gateway boxes are running ipsec-tools-0.3.3-6 and kernel 2.6.9-5.0.3.EL.
2. I've been able to repeat this whole scenario by putting in a third gateway and client at a totally different location and having it try to VPN to either of the other gateways.
3. I've added rules to iptables to make it log packets before dropping them, but it's not showing anything being dropped.
4. The racoon configuration on both gateways has been carefully checked to be sure keys, algorithms, etc. are in sync.
5. NAT is working for the clients. They can reach the Internet through the gateways.

I'm about ready to throw up my hands. What am I missing?

Matthew