so, i have a pretty complex (for me, that is) setup on this one machine that acts as a nameserver and mail server and some other stuff and answers to a handful of ips. it's also a "real server" behind an lvs director. the machine in question is running a modified redhat 6.2 with a 2.2.17ext3 kernel (stock 2.2.17 + ext3 patches + nfs patches).

let me try to describe this as best i can.

our external network is 64.211.224.160/28.
161 is the router/gateway to the rest of the world.
162 is an auth nameserver.
163 is an auth nameserver.
164 is the ip used for outgoing connections from behind masquerading.
165 is for web traffic.
166 is for incoming mail.
and i just put 169 in as a standalone machine.

the 164 masquerading server allows the nameserver/mailserver to send requests to the outside world:

MASQ       all  ------  192.168.1.21         0.0.0.0/0             n/a

the lvs director basically handles all incoming traffic and forwards it to the right place:

IP Virtual Server version 1.0.0-beta1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port          Forward Weight ActiveConn InActConn
TCP  64.211.224.165:443 lc persistent 360
  -> 192.168.1.101:443           Route   1      0          0
  -> 192.168.1.102:443           Route   1      0          0
UDP  64.211.224.162:53 lc
  -> 192.168.1.11:53             Route   1      0          349
UDP  64.211.224.163:53 lc
  -> 192.168.1.12:53             Route   1      0          183
TCP  64.211.224.163:53 lc
  -> 192.168.1.12:53             Route   1      0          0
TCP  64.211.224.162:53 lc
  -> 192.168.1.11:53             Route   1      0          0
TCP  64.211.224.166:22 lc
  -> 192.168.1.21:22             Route   1      0          0
TCP  64.211.224.168:22 lc
  -> 192.168.1.21:22             Route   1      16         0
TCP  64.211.224.166:25 lc
  -> 192.168.1.21:25             Route   1      0          0
TCP  64.211.224.165:80 lc
  -> 192.168.1.101:80            Route   1      0          3
  -> 192.168.1.102:80            Route   1      0          1

then there's the "phl" machine which handles dns and mail:

[root@xxx /root]# /sbin/ifconfig
eth0      Link encap:Ethernet  HWaddr 00:D0:B7:65:EC:48
          inet addr:192.168.1.21  Bcast:192.168.1.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:24535885 errors:0 dropped:0 overruns:0 frame:0
          TX packets:24655159 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          Interrupt:11 Base address:0x2800

eth0:0    Link encap:Ethernet  HWaddr 00:D0:B7:65:EC:48
          inet addr:192.168.1.11  Bcast:192.168.1.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          Interrupt:11 Base address:0x2800

eth0:1    Link encap:Ethernet  HWaddr 00:D0:B7:65:EC:48
          inet addr:192.168.1.12  Bcast:192.168.1.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          Interrupt:11 Base address:0x2800

eth0:2    Link encap:Ethernet  HWaddr 00:D0:B7:65:EC:48
          inet addr:192.168.1.13  Bcast:192.168.1.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          Interrupt:11 Base address:0x2800

eth0:3    Link encap:Ethernet  HWaddr 00:D0:B7:65:EC:48
          inet addr:192.168.1.14  Bcast:192.168.1.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          Interrupt:11 Base address:0x2800

eth0:4    Link encap:Ethernet  HWaddr 00:D0:B7:65:EC:48
          inet addr:192.168.1.10  Bcast:192.168.1.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          Interrupt:11 Base address:0x2800

eth1      Link encap:Ethernet  HWaddr 00:C0:95:E2:85:40
          inet addr:192.168.2.21  Bcast:192.168.2.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:20102464 errors:0 dropped:0 overruns:0 frame:0
          TX packets:19892838 errors:6 dropped:0 overruns:3 carrier:6
          collisions:0 txqueuelen:100
          Interrupt:11 Base address:0x3000

eth1:0    Link encap:Ethernet  HWaddr 00:C0:95:E2:85:40
          inet addr:192.168.2.13  Bcast:192.168.2.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          Interrupt:11 Base address:0x3000

eth1:1    Link encap:Ethernet  HWaddr 00:C0:95:E2:85:40
          inet addr:192.168.2.14  Bcast:192.168.2.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          Interrupt:11 Base address:0x3000

eth1:2    Link encap:Ethernet  HWaddr 00:C0:95:E2:85:40
          inet addr:192.168.2.10  Bcast:192.168.2.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          Interrupt:11 Base address:0x3000

eth2      Link encap:Ethernet  HWaddr 00:C0:95:E2:85:41
          inet addr:192.168.3.21  Bcast:192.168.3.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:74336 errors:0 dropped:0 overruns:0 frame:0
          TX packets:111705 errors:16 dropped:0 overruns:2 carrier:28
          collisions:0 txqueuelen:100
          Interrupt:10 Base address:0x3080

eth2:0    Link encap:Ethernet  HWaddr 00:C0:95:E2:85:41
          inet addr:192.168.3.13  Bcast:192.168.3.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          Interrupt:10 Base address:0x3080

eth2:1    Link encap:Ethernet  HWaddr 00:C0:95:E2:85:41
          inet addr:192.168.3.14  Bcast:192.168.3.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          Interrupt:10 Base address:0x3080

eth2:2    Link encap:Ethernet  HWaddr 00:C0:95:E2:85:41
          inet addr:192.168.3.10  Bcast:192.168.3.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          Interrupt:10 Base address:0x3080

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:3924  Metric:1
          RX packets:191349 errors:0 dropped:0 overruns:0 frame:0
          TX packets:191349 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0

lo:0      Link encap:Local Loopback
          inet addr:64.211.224.162  Mask:255.255.255.240
          UP LOOPBACK RUNNING  MTU:3924  Metric:1

lo:1      Link encap:Local Loopback
          inet addr:64.211.224.163  Mask:255.255.255.240
          UP LOOPBACK RUNNING  MTU:3924  Metric:1

lo:2      Link encap:Local Loopback
          inet addr:64.211.224.166  Mask:255.255.255.240
          UP LOOPBACK RUNNING  MTU:3924  Metric:1

lo:3      Link encap:Local Loopback
          inet addr:64.211.224.168  Mask:255.255.255.240
          UP LOOPBACK RUNNING  MTU:3924  Metric:1

[root@xxx /root]# /sbin/route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
64.211.224.166  0.0.0.0         255.255.255.255 UH    0      0        0 lo
192.168.2.10    0.0.0.0         255.255.255.255 UH    0      0        0 eth1
192.168.2.13    0.0.0.0         255.255.255.255 UH    0      0        0 eth1
192.168.1.21    0.0.0.0         255.255.255.255 UH    0      0        0 eth0
192.168.3.21    0.0.0.0         255.255.255.255 UH    0      0        0 eth2
64.211.224.162  0.0.0.0         255.255.255.255 UH    0      0        0 lo
64.211.224.163  0.0.0.0         255.255.255.255 UH    0      0        0 lo
192.168.2.14    0.0.0.0         255.255.255.255 UH    0      0        0 eth1
192.168.1.11    0.0.0.0         255.255.255.255 UH    0      0        0 eth0
192.168.1.10    0.0.0.0         255.255.255.255 UH    0      0        0 eth0
192.168.3.10    0.0.0.0         255.255.255.255 UH    0      0        0 eth2
192.168.1.13    0.0.0.0         255.255.255.255 UH    0      0        0 eth0
192.168.3.13    0.0.0.0         255.255.255.255 UH    0      0        0 eth2
192.168.2.21    0.0.0.0         255.255.255.255 UH    0      0        0 eth1
192.168.1.12    0.0.0.0         255.255.255.255 UH    0      0        0 eth0
64.211.224.168  0.0.0.0         255.255.255.255 UH    0      0        0 lo
192.168.1.14    0.0.0.0         255.255.255.255 UH    0      0        0 eth0
192.168.3.14    0.0.0.0         255.255.255.255 UH    0      0        0 eth2
64.211.224.160  0.0.0.0         255.255.255.240 U     0      0        0 eth0
192.168.3.0     0.0.0.0         255.255.255.0   U     0      0        0 eth2
192.168.2.0     0.0.0.0         255.255.255.0   U     0      0        0 eth1
192.168.1.0     0.0.0.0         255.255.255.0   U     0      0        0 eth0
127.0.0.0       0.0.0.0         255.0.0.0       U     0      0        0 lo

[root@xxx /root]# cat /etc/sysctl.conf
# Disables packet forwarding
net.ipv4.ip_forward = 1
# Enables source route verification
net.ipv4.conf.all.rp_filter = 1
# Disables automatic defragmentation (needed for masquerading, LVS)
net.ipv4.ip_always_defrag = 0
# Disables the magic-sysrq key
kernel.sysrq = 1
# -tcl.
net.ipv4.conf.all.send_redirects = 0
net.ipv4.conf.eth0.send_redirects = 0
net.ipv4.conf.all.hidden = 1
net.ipv4.conf.lo.hidden = 1
#

[root@xxx /root]# tail --lines 30 /etc/rc.d/rc.local
#
# -tcl.
#
# the whole static-routes / network scripts / lo:# / gateway being on a
# different device than ips on the same network / bl ah blah lah sajdhsd.
# totally flaky. let's just do it all here.
#
/sbin/sysctl -p
/sbin/route add -net 64.211.224.160 netmask 255.255.255.240 dev eth0
#/sbin/route add default gw 64.211.224.161 dev eth0
##/sbin/arp -s 64.211.224.161 00:30:B6:67:00:40
/sbin/arp -s 64.211.224.161 00:30:B6:67:00:AA
#/sbin/ip rule add prio 100 from 192.168.1.0/24 table 100
#/sbin/ip route add table 100 0/0 via 192.168.1.1 dev eth0
/sbin/ifconfig lo:0 64.211.224.162 netmask 255.255.255.240 broadcast 64.211.224.175 up
/sbin/route add -host 64.211.224.162 dev lo:0
/sbin/ifconfig lo:1 64.211.224.163 netmask 255.255.255.240 broadcast 64.211.224.175 up
/sbin/route add -host 64.211.224.163 dev lo:1
/sbin/ifconfig lo:2 64.211.224.166 netmask 255.255.255.240 broadcast 64.211.224.175 up
/sbin/route add -host 64.211.224.166 dev lo:2
/sbin/ifconfig lo:3 64.211.224.168 netmask 255.255.255.240 broadcast 64.211.224.175 up
/sbin/route add -host 64.211.224.168 dev lo:3
#/sbin/ip rule add prio 33000 from 192.168.1.0/24 table 100
/sbin/ip route add table 100 0/0 via 192.168.1.1 dev eth0
#/sbin/ip rule add prio 34000 from 0/0 table 200
/sbin/ip route add table 200 0/0 via 64.211.224.161 dev eth0
/sbin/ip rule add prio 33000 from 64.211.224.160/28 table 200
/sbin/ip rule add prio 34000 from 0/0 table 100
#

[root@xxx /root]# ip rule
0:      from all lookup local
32766:  from all lookup main
32767:  from all lookup 253
33000:  from 64.211.224.160/28 lookup 200
34000:  from all lookup 100

[root@xxx /root]# ip route
64.211.224.166 dev lo  scope link  src 64.211.224.166
192.168.2.10 dev eth1  scope link  src 192.168.2.10
192.168.2.13 dev eth1  scope link  src 192.168.2.13
192.168.1.21 dev eth0  scope link
192.168.3.21 dev eth2  scope link
64.211.224.162 dev lo  scope link  src 64.211.224.162
64.211.224.163 dev lo  scope link  src 64.211.224.163
192.168.2.14 dev eth1  scope link  src 192.168.2.14
192.168.1.11 dev eth0  scope link  src 192.168.1.11
192.168.1.10 dev eth0  scope link  src 192.168.1.10
192.168.3.10 dev eth2  scope link  src 192.168.3.10
192.168.1.13 dev eth0  scope link  src 192.168.1.13
192.168.3.13 dev eth2  scope link  src 192.168.3.13
192.168.2.21 dev eth1  scope link
192.168.1.12 dev eth0  scope link  src 192.168.1.12
64.211.224.168 dev lo  scope link  src 64.211.224.168
192.168.1.14 dev eth0  scope link  src 192.168.1.14
192.168.3.14 dev eth2  scope link  src 192.168.3.14
64.211.224.160/28 dev eth0  scope link
192.168.3.0/24 dev eth2  proto kernel  scope link  src 192.168.3.21
192.168.2.0/24 dev eth1  proto kernel  scope link  src 192.168.2.21
192.168.1.0/24 dev eth0  proto kernel  scope link  src 192.168.1.21
127.0.0.0/8 dev lo  scope link
[root@xxx /root]# ip route list table 100
default via 192.168.1.1 dev eth0
[root@xxx /root]# ip route list table 200
default via 64.211.224.161 dev eth0
[root@xxx /root]#

the end result of this is that, well, for example, a nameservice query gets directed through the lvs director to the phl real server, which answers it via direct routing. phl can also get to the outside world to deliver mail / make dns queries of its own via the masquerading. the policy routing says that traffic with a source ip of 64.211.224.160/28 gets sent via 64.211.224.161 (direct routing instead of nat/masq), whereas traffic with a source ip of anything else should go through 192.168.1.1 and be masqueraded. those 192.168.2 and .3 and whatever other networks on there can be ignored.

/me breathes.

ok. so all that has been working perfectly for months. the problem is that now i added a machine on 64.211.224.169 to do mail serving and some other stuff for our employees. for example, mail to @mybiz-inc.com gets delivered to 64.211.224.169, while mail to @mybiz.com gets directed to 64.211.224.166 (through the lvs director and to phl).

the problem is that phl can't send traffic to 64.211.224.169 -- phl seems to think that 64.211.224.169 is on its loopback interface.
64.211.224.169 tries to make nameservice queries for 169.160-175.224.211.64.in-addr.arpa and *.mybiz.com to 64.211.224.162 and 64.211.224.163 (the auth nameservers for that -- phl handles them), but phl never responds. phl also tries to deliver mail to 64.211.224.169, but it can't send traffic there.

check out:

[root@xxx /root]# tcpdump -n host 64.211.224.169 and not port 53 &
[1] 20668
User level filter, protocol ALL, datagram packet socket
tcpdump: listening on all devices
[root@xxx /root]# ping -n -c 5 64.211.224.169
PING 64.211.224.169 (64.211.224.169) from 64.211.224.169 : 56(84) bytes of data.
14:04:36.653475   lo > 64.211.224.169 > 64.211.224.169: icmp: echo request
14:04:36.653475   lo < 64.211.224.169 > 64.211.224.169: icmp: echo request
14:04:36.653506   lo > 64.211.224.169 > 64.211.224.169: icmp: echo reply
14:04:36.653506   lo < 64.211.224.169 > 64.211.224.169: icmp: echo reply
64 bytes from 64.211.224.169: icmp_seq=0 ttl=255 time=63 usec
14:04:37.649412   lo > 64.211.224.169 > 64.211.224.169: icmp: echo request
14:04:37.649412   lo < 64.211.224.169 > 64.211.224.169: icmp: echo request
14:04:37.649430   lo > 64.211.224.169 > 64.211.224.169: icmp: echo reply
14:04:37.649430   lo < 64.211.224.169 > 64.211.224.169: icmp: echo reply
64 bytes from 64.211.224.169: icmp_seq=1 ttl=255 time=34 usec
14:04:38.649446   lo > 64.211.224.169 > 64.211.224.169: icmp: echo request
14:04:38.649446   lo < 64.211.224.169 > 64.211.224.169: icmp: echo request
14:04:38.649462   lo > 64.211.224.169 > 64.211.224.169: icmp: echo reply
14:04:38.649462   lo < 64.211.224.169 > 64.211.224.169: icmp: echo reply
64 bytes from 64.211.224.169: icmp_seq=2 ttl=255 time=28 usec
14:04:39.649495   lo > 64.211.224.169 > 64.211.224.169: icmp: echo request
14:04:39.649495   lo < 64.211.224.169 > 64.211.224.169: icmp: echo request
14:04:39.649516   lo > 64.211.224.169 > 64.211.224.169: icmp: echo reply
14:04:39.649516   lo < 64.211.224.169 > 64.211.224.169: icmp: echo reply
64 bytes from 64.211.224.169: icmp_seq=3 ttl=255 time=37 usec
14:04:40.649527   lo > 64.211.224.169 > 64.211.224.169: icmp: echo request
14:04:40.649527   lo < 64.211.224.169 > 64.211.224.169: icmp: echo request
14:04:40.649545   lo > 64.211.224.169 > 64.211.224.169: icmp: echo reply
14:04:40.649545   lo < 64.211.224.169 > 64.211.224.169: icmp: echo reply
64 bytes from 64.211.224.169: icmp_seq=4 ttl=255 time=31 usec

--- 64.211.224.169 ping statistics ---
5 packets transmitted, 5 packets received, 0% packet loss
round-trip min/avg/max/mdev = 0.028/0.038/0.063/0.014 ms
[root@xxx /root]# fg
tcpdump -n host 64.211.224.169 and not port 53

158 packets received by filter
[root@xxx /root]#

when 169 tries to telnet to 166 port 25 (which gets directed to phl):

[root@xxx /root]# tcpdump -n host 64.211.224.169 and not port 53
User level filter, protocol ALL, datagram packet socket
tcpdump: listening on all devices
14:05:20.460200 eth0 B arp who-has 64.211.224.169 tell 64.211.224.162
14:05:50.883915 eth0 B arp who-has 64.211.224.166 tell 64.211.224.169
14:05:50.884155 eth0 < 64.211.224.169.1058 > 64.211.224.166.smtp: S 4151665104:4151665104(0) win 32120 <mss 1460,sackOK,timestamp 25658644 0,nop,wscale 0> (DF)
14:05:53.879424 eth0 < 64.211.224.169.1058 > 64.211.224.166.smtp: S 4151665104:4151665104(0) win 32120 <mss 1460,sackOK,timestamp 25658944 0,nop,wscale 0> (DF)

725 packets received by filter

no response is ever sent.

when phl tries to send mail to mybiz-inc.com:

[root@xxx /root]# dnsmx mybiz-inc.com
0 mail.mybiz-inc.com
[root@xxx /root]# dnsip mail.mybiz-inc.com
64.211.224.169
[root@xxx /root]# telnet 64.211.224.169 25
Trying 64.211.224.169...
Connected to inc.mybiz.com (64.211.224.169).
Escape character is '^]'.
220 phl.usa.mybiz ESMTP
^]q
Connection closed.
[root@xxx /root]#
14:07:39.001323   lo > 64.211.224.169.1549 > 64.211.224.169.smtp: S 4291120419:4291120419(0) win 31072 <mss 3884,sackOK,timestamp 441773751 0,nop,wscale 0> (DF)
14:07:39.001323   lo < 64.211.224.169.1549 > 64.211.224.169.smtp: S 4291120419:4291120419(0) win 31072 <mss 3884,sackOK,timestamp 441773751 0,nop,wscale 0> (DF)
14:07:39.001367   lo > 64.211.224.169.smtp > 64.211.224.169.1549: S 200723:200723(0) ack 4291120420 win 31072 <mss 3884,sackOK,timestamp 441773751 441773751,nop,wscale 0> (DF)
14:07:39.001367   lo < 64.211.224.169.smtp > 64.211.224.169.1549: S 200723:200723(0) ack 4291120420 win 31072 <mss 3884,sackOK,timestamp 441773751 441773751,nop,wscale 0> (DF)
14:07:39.001390   lo > 64.211.224.169.1549 > 64.211.224.169.smtp: . 1:1(0) ack 1 win 31072 <nop,nop,timestamp 441773751 441773751> (DF)
14:07:39.001390   lo < 64.211.224.169.1549 > 64.211.224.169.smtp: . 1:1(0) ack 1 win 31072 <nop,nop,timestamp 441773751 441773751> (DF)
14:07:39.007531   lo > 64.211.224.169.smtp > 64.211.224.169.1549: P 1:26(25) ack 1 win 31072 <nop,nop,timestamp 441773752 441773751> (DF)
14:07:39.007531   lo < 64.211.224.169.smtp > 64.211.224.169.1549: P 1:26(25) ack 1 win 31072 <nop,nop,timestamp 441773752 441773751> (DF)
14:07:39.007570   lo > 64.211.224.169.1549 > 64.211.224.169.smtp: . 1:1(0) ack 26 win 31047 <nop,nop,timestamp 441773752 441773752> (DF)
14:07:39.007570   lo < 64.211.224.169.1549 > 64.211.224.169.smtp: . 1:1(0) ack 26 win 31047 <nop,nop,timestamp 441773752 441773752> (DF)
Connected to inc.mybiz.com (64.211.224.169).

it tries to send to itself.

does anyone have any idea why phl would think 64.211.224.169 is on its lo? it seems to think that for all of 64.211.224.160/28. if i telnet to port 25 on any ip in that range, phl directs the request to itself on lo just like 169.

anyone even understand this? heh. i'm seriously confused myself. i'd love to hear any ideas.

-tcl.
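
p.s. one thing i can at least check with plain arithmetic: the lo:N aliases in rc.local are brought up with netmask 255.255.255.240, so the subnet they describe runs from .160 to .175, and .169 sits inside it. whether the kernel really treats that whole subnet as local because it lives on a loopback device is only a guess on my part, not something i've confirmed; `ip route list table local` and `ip route get 64.211.224.169` would probably be the places to look. the helper names below (ip_to_int, in_subnet) are made up for this sketch, and the 255.255.255.255 case is hypothetical, not what the machine is running now:

```shell
#!/bin/sh
# sketch only: checks subnet membership by arithmetic, it does not touch
# or read the kernel routing tables.

# convert a dotted quad (e.g. 64.211.224.162) to a 32-bit integer
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1            # split the address on dots into $1..$4
    IFS=$old_ifs
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# succeed if address $1 is inside the subnet implied by address $2 and netmask $3
in_subnet() {
    addr=$(ip_to_int "$1")
    base=$(ip_to_int "$2")
    mask=$(ip_to_int "$3")
    [ $(( $addr & $mask )) -eq $(( $base & $mask )) ]
}

# lo:0 exactly as configured in rc.local
if in_subnet 64.211.224.169 64.211.224.162 255.255.255.240; then
    echo "169 is inside the subnet configured on lo:0"
fi

# hypothetical alternative: same alias with a host netmask
if ! in_subnet 64.211.224.169 64.211.224.162 255.255.255.255; then
    echo "169 is outside lo:0 with a host netmask"
fi
```

this only shows that the address ranges overlap; it says nothing by itself about what the routing code does with them.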