Re: Arp problems I think ...


In article <47572D0D.4050107@xxxxxxxx>, Alan Bunch <Alan.Bunch@xxxxxxxx> wrote:
> Please bear with me as I know I have included a lot of detail.
> 
> Description
> Redhat AS 3
> Kernel 2.4.21-47.0.1.ELsmp
> eth0      HWaddr 00:07:E9:11:30:76
>           inet addr:192.168.1.3  Bcast:192.168.1.255  Mask:255.255.255.0
> 
> eth0:1    HWaddr 00:07:E9:11:30:76
>           inet addr:192.168.3.1  Bcast:192.168.3.255  Mask:255.255.255.0
>           
> This interface is up but with no IP address
> eth1      HWaddr 00:07:E9:11:30:77
> 
> eth1.101  10.10.1.1  Bcast:10.10.1.255  Mask:255.255.255.0  vlan 101
> eth1.102  10.10.2.1  Bcast:10.10.2.255  Mask:255.255.255.0  vlan 102
> eth1.103  10.10.3.1  Bcast:10.10.3.255  Mask:255.255.255.0  vlan 103
> eth1.104  10.10.4.1  Bcast:10.10.4.255  Mask:255.255.255.0  vlan 104
> eth1.105  10.10.5.1  Bcast:10.10.5.255  Mask:255.255.255.0  vlan 105
> eth1.106  10.10.6.1  Bcast:10.10.6.255  Mask:255.255.255.0  vlan 106
> eth1.107  10.10.7.1  Bcast:10.10.7.255  Mask:255.255.255.0  vlan 107
> eth1.108  10.10.8.1  Bcast:10.10.8.255  Mask:255.255.255.0  vlan 108
> eth1.109  10.10.9.1  Bcast:10.10.9.255  Mask:255.255.255.0  vlan 109
> eth2      Link encap:Ethernet  HWaddr 00:06:5B:FE:56:C2
>           BROADCAST MULTICAST  MTU:1500  Metric:1      
> eth3      Link encap:Ethernet  HWaddr 00:06:5B:FE:56:C3
>           BROADCAST MULTICAST  MTU:1500  Metric:1

I assume the above machine is "the router". I don't know whether leaving
eth1 without an IP address could be the source of any problems...

> This machine is routing between the vlans.  When I ping 10.10.5.105 I 
> get Host Unreachable.

Presumably, 10.10.5.105 is what you refer to below as "the device".

> Here is tcpdump from the ping on the router.
> 
> tcpdump -i eth1.105
> 14:10:28.254008 arp who-has 10.10.5.105 tell 10.10.5.1
> 14:10:29.250067 arp who-has 10.10.5.105 tell 10.10.5.1
> 14:10:30.250143 arp who-has 10.10.5.105 tell 10.10.5.1

This suggests either:
a) the arp request is not being heard/understood by the device, or
b) the arp reply is not being heard by the router, or even perhaps
c) the lack of -n is causing tcpdump not to display some packets
   while it is trying to resolve addresses to hostnames.

Try again using "tcpdump -n -e -i any". The -e option includes the
ethernet addresses, and "-i any" monitors all interfaces instead of just
eth1.105 (in case a packet is going to the wrong interface).
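
When comparing the ethernet addresses that -e prints against the HWaddr
values from ifconfig, bear in mind that tcpdump output like that quoted in
this thread drops leading zeros (0:7:e9:11:30:77 versus 00:07:E9:11:30:77).
A minimal Python sketch for normalising them before comparing; the helper
names normalize_mac and is_broadcast are made up for illustration:

```python
def normalize_mac(mac):
    """Return a MAC address as lower-case, zero-padded, colon-separated hex."""
    parts = mac.strip().split(":")
    if len(parts) != 6:
        raise ValueError("expected 6 colon-separated octets: %r" % mac)
    octets = [int(p, 16) for p in parts]
    if any(o < 0 or o > 255 for o in octets):
        raise ValueError("octet out of range in %r" % mac)
    return ":".join("%02x" % o for o in octets)


def is_broadcast(mac):
    """True if the address is the ethernet broadcast address."""
    return normalize_mac(mac) == "ff:ff:ff:ff:ff:ff"


if __name__ == "__main__":
    # The router's eth1 address as shown by tcpdump and by ifconfig.
    print(normalize_mac("0:7:e9:11:30:77") == normalize_mac("00:07:E9:11:30:77"))
    print(is_broadcast("FF:FF:FF:FF:FF:FF"))
```

Both lines should print True; an ARP who-has whose destination address
fails the is_broadcast test would be worth a closer look.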

Also, what is the routing table shown by "netstat -rn"?

> Ok now I go to the device via a serial port and ping back to 10.10.5.1 ( 
> the router ) and here is the tcpdump output
> 
> tcpdump -i eth1.105 -n
> tcpdump: listening on eth1.105
> 14:12:06.706722 arp who-has 10.10.5.1 tell 10.10.5.105
> 14:12:06.706798 arp reply 10.10.5.1 is-at 0:7:e9:11:30:77
> 14:12:06.707715 10.10.5.105 > 10.10.5.1: icmp: echo request (DF)
> 14:12:06.707762 10.10.5.1 > 10.10.5.105: icmp: echo reply
> 14:12:07.723100 10.10.5.105 > 10.10.5.1: icmp: echo request (DF)
> 14:12:07.723136 10.10.5.1 > 10.10.5.105: icmp: echo reply

OK...

> Now of course I can ping 10.10.5.10 (the suspect device) from 10.10.5.1 
> ( the router )

Is 10.10.5.10 a typo for 10.10.5.105?

What kind of unit is the "suspect device"? Can you display "ifconfig -a"
and "netstat -rn" or the equivalent on it?

> ping 10.10.5.1
> PING 10.10.5.1 (10.10.5.1) 56(84) bytes of data.
> 64 bytes from 10.10.5.1: icmp_seq=0 ttl=64 time=0.068 ms
> 64 bytes from 10.10.5.1: icmp_seq=1 ttl=64 time=0.039 ms         

But this appears to be pinging TO the router, not pinging FROM the router!

> This is fine untill the arp entry ages out.  Then I back to not being 
> able to ping the device.
> 
> If I manually insert an arp table entry all is well.  No filtering in 
> the switches.  Switches are SMC 6826 for the 10/100 and 8724 for the 
> core and gig e.
> 
> I have several similar symptoms like this in various places.  I have 
> some devices on vlans that I see the dhcp discover messages and I see 
> the dhcp offer then the device sends another dhcp discover.  This goes 
> on for a few times and the device just waits and starts the process over.
> 
> I feel that the problem lies in the handling of arp requests but I guess 
> I just don't know enough about how linux handles them or how to control 
> them to find a solution.
> 
> Any ideas ?

Sounds more like a general broadcast issue to me, since DHCP discovers
and offers are sent as broadcasts and therefore don't need ARP first.
If you use -e in tcpdump you will see whether the broadcast ethernet
address is being used (ff:ff:ff:ff:ff:ff) or not. If ARP and DHCP packets
are not using the broadcast ethernet address, then something is not right
with the netmask or the broadcast address.
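
As a quick sanity check on the netmask and broadcast address, something
like the following Python sketch computes what the broadcast address
should be for a given address and mask, using the standard ipaddress
module. The function names are hypothetical, and the addresses are the
ones quoted above for eth1.105 and the device:

```python
import ipaddress


def expected_broadcast(addr, mask):
    """Return the broadcast address implied by an IPv4 address and netmask."""
    iface = ipaddress.IPv4Interface("%s/%s" % (addr, mask))
    return str(iface.network.broadcast_address)


def on_same_subnet(addr, mask, other):
    """True if 'other' lies inside the network defined by addr/mask."""
    iface = ipaddress.IPv4Interface("%s/%s" % (addr, mask))
    return ipaddress.IPv4Address(other) in iface.network


if __name__ == "__main__":
    # Router side of vlan 105, as listed in the ifconfig output.
    print(expected_broadcast("10.10.5.1", "255.255.255.0"))  # 10.10.5.255
    print(on_same_subnet("10.10.5.1", "255.255.255.0", "10.10.5.105"))  # True
```

If the Bcast value that ifconfig reports on either end differs from the
computed one, or the device's own mask puts the router outside its subnet,
that would be the first thing to correct.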

Interesting problem - let us know how you get on.

Cheers
Tony
-- 
Tony Mountifield
Work: tony@xxxxxxxxxxxxx - http://www.softins.co.uk
Play: tony@xxxxxxxxxxxxxxx - http://tony.mountifield.org
_______________________________________________
CentOS mailing list
CentOS@xxxxxxxxxx
http://lists.centos.org/mailman/listinfo/centos
