	Hello,

Daniel Gryniewicz wrote:

> Let me explain my setup, and then you can hopefully explain why I'm
> wrong.  I work for a company that writes routing software.  We support
> and test on a mixture of Unix OSs, one being Linux, and RTOSs.  Here's
> an example setup:
>
> ---+----+--- (192.168.1/24)
>    |    |
>    A    B
>    |    |
> ---+----+--- (10.10/16)
>
> A: 192.168.1.1, 10.10.0.1
> B: 192.168.1.2, 10.10.0.2
>
> A and B usually have default routes pointed to 192.168.1.254, as this is
> the router connecting the testbed to the workstations.  In this example,
> A is Linux, B is any other OS we test against ([Free|Net|Open]BSD, BSDi,
> Solaris, AIX, Tru64).  Both are running ISIS, which does *not* use IP
> as a transport (this is important, I'll explain later).  ISIS installs a
> new default route on A that points at 10.10.0.2.  If, at this point,
> there is no ARP entry in A's ARP cache for 10.10.0.2, it will send an
> ARP request to the 10.10/16 network as follows:
> whohas 10.10.0.2 tell 192.168.1.1
> This happens every time in this situation.  If we were running, say,
> OSPF (which does use IP as a transport), then A's ARP cache would always
> have an entry for 10.10.0.2, and this situation would not show up.
> However, we have confirmed that, if you clear the ARP cache on A, and
> then change its default route from one network to another, it will
> always generate these bizarre ARP requests.  Since all the other OSs we

	Yes, such behaviour can also be observed when cross-subnet
talks occur.

> run will not answer that request, this results in A being completely
> unreachable from off of its directly connected networks.  Those ARP
> requests are generated even if the packets being sent are pings to
> 10.10.0.2.  To me, this seems broken, and has resulted in us not
> recommending Linux to any of our customers.  (The company is historically

	No, I would not use the word "broken". Note that both ways of
answering (the Linux way and the way of the other OSes) are valid for
specific setups.
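	To make the two behaviours concrete, here is a toy model in
Python. It is only a sketch of the idea, not kernel code: the interface
names eth0/eth1 and all function names are made up for illustration, and
the "strict" receiver is an assumption based on your report that the
other OSes do not answer such requests.

```python
import ipaddress

# Host A from the example. The interface names are assumptions.
HOST_A = {
    "eth0": ipaddress.ip_interface("192.168.1.1/24"),
    "eth1": ipaddress.ip_interface("10.10.0.1/16"),
}


def sender_from_packet(packet_saddr):
    """Announce the source address of the IP packet that triggered ARP."""
    return ipaddress.ip_address(packet_saddr)


def sender_from_device(target, out_dev, addrs):
    """Announce the address of out_dev if it shares a subnet with target."""
    iface = addrs[out_dev]
    if ipaddress.ip_address(target) in iface.network:
        return iface.ip
    return None  # no suitable address on this device


def strict_host_answers(sender, receiving_iface):
    """Assumed receiver policy: answer only senders on the local subnet."""
    return sender in receiving_iface.network


# Host B's interface on the 10.10/16 network.
b_iface = ipaddress.ip_interface("10.10.0.2/16")

# "whohas 10.10.0.2 tell 192.168.1.1": sender taken from the packet.
observed = sender_from_packet("192.168.1.1")

# "whohas 10.10.0.2 tell 10.10.0.1": sender taken from the output device.
expected = sender_from_device("10.10.0.2", "eth1", HOST_A)

print(observed, strict_host_answers(observed, b_iface))
print(expected, strict_host_answers(expected, b_iface))
```

	In this model the first request is ignored by B and the second
one is answered, which matches what you describe. Which sender address
is the right one depends on the setup.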
	What we have is limited behaviour. We are not trying to fix
bugs or weirdness in other OSes, but look: why do these "other" hosts
not work with such ARP packets? This is a real bug, or at least wrong
routing. Is an IP packet with saddr=192.168.1.1 and daddr=UNIVERSE
allowed to reach the default gateway? Yes. Then why is the ARP request
not answered? Because the GW does not have valid host routes for
192.168.1.1 and 192.168.1.2. Are the routes to these IPs through a
gateway and, if so, which gateway is it? You need valid host routes. At
least, this is so in Linux. If you investigate further, you will see
that you need alternative routes for these hosts, because both hosts
can be reached through two devices. That is, such a complex setup does
not work by magic, as you expect.

> largely a BSD house.  I'm trying to change this bias somewhat).  2.4.16
> does this, as well as 2.2.*.  I've gotten a patch for 2.2.19 that fixes
> the problem, but all I've heard about 2.4 is "That's how it's supposed
> to work".  If this is not broken, how do I get it to work?

	As we see, different setups need different ARP behaviour. This
is why I use the word "tuning" for the ARP traffic/behaviour. If you do
not want one IP to be announced through multiple devices (assuming this
does not cause other problems for your setup), you can use an iparp
command such as:

	ip arp add output from 0/0 src 0

	This will cause all ARP requests to announce the preferred
source IP to the target (what you really want). You can tune the
behaviour further by specifying subnets, interfaces, etc., if you have
problems like those mentioned in my previous posting. See iparp.txt for
the syntax.

> Daniel

Regards

--
Julian Anastasov <ja@ssi.bg>