In article <47572D0D.4050107@xxxxxxxx>,
Alan Bunch <Alan.Bunch@xxxxxxxx> wrote:
> Please bear with me as I know I have included a lot of detail.
>
> Description
> Redhat AS 3
> Kernel 2.4.21-47.0.1.ELsmp
> eth0      HWaddr 00:07:E9:11:30:76
>           inet addr:192.168.1.3  Bcast:192.168.1.255  Mask:255.255.255.0
>
> eth0:1    HWaddr 00:07:E9:11:30:76
>           inet addr:192.168.3.1  Bcast:192.168.3.255  Mask:255.255.255.0
>
> This interface is up but with no IP address
> eth1      HWaddr 00:07:E9:11:30:77
>
> eth1.101 10.10.1.1 Bcast:10.10.1.255 Mask:255.255.255.0 vlan 101
> eth1.102 10.10.2.1 Bcast:10.10.2.255 Mask:255.255.255.0 vlan 102
> eth1.103 10.10.3.1 Bcast:10.10.3.255 Mask:255.255.255.0 vlan 103
> eth1.104 10.10.4.1 Bcast:10.10.4.255 Mask:255.255.255.0 vlan 104
> eth1.105 10.10.5.1 Bcast:10.10.5.255 Mask:255.255.255.0 vlan 105
> eth1.106 10.10.6.1 Bcast:10.10.6.255 Mask:255.255.255.0 vlan 106
> eth1.107 10.10.7.1 Bcast:10.10.7.255 Mask:255.255.255.0 vlan 107
> eth1.108 10.10.8.1 Bcast:10.10.8.255 Mask:255.255.255.0 vlan 108
> eth1.109 10.10.9.1 Bcast:10.10.9.255 Mask:255.255.255.0 vlan 109
> eth2      Link encap:Ethernet  HWaddr 00:06:5B:FE:56:C2
>           BROADCAST MULTICAST  MTU:1500  Metric:1
> eth3      Link encap:Ethernet  HWaddr 00:06:5B:FE:56:C3
>           BROADCAST MULTICAST  MTU:1500  Metric:1

I assume the above machine is "the router". I don't know whether leaving
eth1 without an IP address could be the source of any problems...

> This machine is routing between the vlans.  When I ping 10.10.5.105 I
> get Host Unreachable.

Presumably, 10.10.5.105 is what you refer to below as "the device".

> Here is tcpdump from the ping on the router.
>
> tcpdump -i eth1.105
> 14:10:28.254008 arp who-has 10.10.5.105 tell 10.10.5.1
> 14:10:29.250067 arp who-has 10.10.5.105 tell 10.10.5.1
> 14:10:30.250143 arp who-has 10.10.5.105 tell 10.10.5.1

This suggests either:
a) the arp request is not being heard/understood by the device, or
b) the arp reply is not being heard by the router, or even perhaps
c) the lack of -n is causing tcpdump not to display some packets while
   it is trying to resolve addresses to hostnames.

Try again using this: "tcpdump -n -e -i any" - this will include the
ethernet address and will monitor all interfaces instead of just 105
(in case a packet is going to the wrong interface).

Also, what is the routing table shown by "netstat -rn"?

> Ok now I go to the device via a serial port and ping back to 10.10.5.1 (
> the router ) and here is the tcpdump output
>
> tcpdump -i eth1.105 -n
> tcpdump: listening on eth1.105
> 14:12:06.706722 arp who-has 10.10.5.1 tell 10.10.5.105
> 14:12:06.706798 arp reply 10.10.5.1 is-at 0:7:e9:11:30:77
> 14:12:06.707715 10.10.5.105 > 10.10.5.1: icmp: echo request (DF)
> 14:12:06.707762 10.10.5.1 > 10.10.5.105: icmp: echo reply
> 14:12:07.723100 10.10.5.105 > 10.10.5.1: icmp: echo request (DF)
> 14:12:07.723136 10.10.5.1 > 10.10.5.105: icmp: echo reply

OK...

> Now of course I can ping 10.10.5.10 (the suspect device) from 10.10.5.1
> ( the router )

Is 10.10.5.10 a typo for 10.10.5.105?

What kind of unit is the "suspect device"? Can you display "ifconfig -a"
and "netstat -rn" or the equivalent on it?

> ping 10.10.5.1
> PING 10.10.5.1 (10.10.5.1) 56(84) bytes of data.
> 64 bytes from 10.10.5.1: icmp_seq=0 ttl=64 time=0.068 ms
> 64 bytes from 10.10.5.1: icmp_seq=1 ttl=64 time=0.039 ms

But this appears to be pinging TO the router, not pinging FROM the router!

> This is fine untill the arp entry ages out.  Then I back to not being
> able to ping the device.
>
> If I manually insert an arp table entry all is well.  No filtering in
> the switches.
> Switches are SMC 6826 for the 10/100 and 8724 for the
> core and gig e.
>
> I have several similar symptoms like this in various places.  I have
> some devices on vlans that I see the dhcp discover messages and I see
> the dhcp offer then the device sends another dhcp discover.  This goes
> on for a few times and the device just waits and starts the process over.
>
> I feel that the problem lies in the handling of arp requests but I guess
> I just don't know enough about how linux handles them or how to control
> them to find a solution.
>
> Any ideas ?

Sounds more like a general broadcast issue to me, since DHCP discovers and
offers are sent as broadcasts and therefore don't need ARP first. If you
use -e in tcpdump you will see whether the broadcast ethernet address is
being used (ff:ff:ff:ff:ff:ff) or not.

If ARP and DHCP packets are not using the broadcast ethernet address, then
something is not right with the netmask or the broadcast address.

Interesting problem - let us know how you get on.

Cheers
Tony
-- 
Tony Mountifield
Work: tony@xxxxxxxxxxxxx - http://www.softins.co.uk
Play: tony@xxxxxxxxxxxxxxx - http://tony.mountifield.org
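
P.S. In case it helps with the netmask/broadcast check, here is a rough
Python sketch. It is only an illustration, not part of any tool on the
router, and the function names expected_broadcast and
is_ethernet_broadcast are made up for this example. It works out the
broadcast address implied by an IP address and netmask, and tests whether
a destination MAC copied by hand from "tcpdump -e" output is the ethernet
broadcast address.

```python
import ipaddress


def expected_broadcast(ip, netmask):
    """Return the broadcast address implied by an IPv4 address and netmask."""
    # strict=False lets us pass a host address rather than the network address.
    network = ipaddress.IPv4Network(f"{ip}/{netmask}", strict=False)
    return str(network.broadcast_address)


def is_ethernet_broadcast(mac):
    """Return True if mac is the all-ones ethernet broadcast address.

    Octets are parsed as hex, so both "ff:ff:..." and forms without
    leading zeros (as older tcpdump versions print them) are handled.
    """
    octets = mac.strip().replace("-", ":").split(":")
    return len(octets) == 6 and all(int(o, 16) == 0xFF for o in octets)


if __name__ == "__main__":
    # Address and netmask of eth1.105 as posted in the original message.
    print(expected_broadcast("10.10.5.1", "255.255.255.0"))
    # Router MAC as it appears in the posted tcpdump output.
    print(is_ethernet_broadcast("0:7:e9:11:30:77"))
    print(is_ethernet_broadcast("ff:ff:ff:ff:ff:ff"))
```

If the first value differs from the Bcast shown by ifconfig on either the
router or the device, that would be the first thing to put right.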