Re: Odd result and underlying mistake

cronolog+lartc <cronolog+lartc@xxxxxxxxxxxxxx> · Tue, 20 Mar 2018 02:29:30 +0000

On 2018-03-17 01:36, Grant Taylor wrote:
On 03/16/2018 02:54 PM, Leroy Tennison wrote:
Posting for others' benefit in case someone else does this.  I 
searched the web without finding an answer then discovered the 
issue.  What I saw in a tcpdump output (because things weren't 
working) was

Request who-has <target IP address> tell <target IP address>

Where <target IP address> was a local interface address, quite odd 
since the local interface should know its own MAC address.

That sounds like a Gratuitous ARP.
I agree that Gratuitous ARP is generating this.  It's used for things 
like IP address conflict detection and flushing stale ARP caches on 
link-local neighbours, and is quite normal to see.

The problem was that I had accidentally used the local interface IP 
address in 'ip route add default via <local interface IP address> dev 
<local interface>' instead of 'ip route add default via <gateway IP 
address accessible from local interface> dev <interface>'.
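
A small guard against this particular slip, as a sketch only: before adding a default route, check that the intended gateway is not one of the host's own addresses. The function name gateway_is_local is made up for illustration. It assumes the one-line-per-address layout printed by `ip -o -4 addr show`, where the fourth field is address/prefix.

```shell
# Succeed (exit 0) if the candidate gateway matches one of the IPv4
# addresses listed on stdin. Input is expected in the layout of
# `ip -o -4 addr show`, where field 4 is "address/prefix".
# gateway_is_local is a hypothetical helper, not part of iproute2.
gateway_is_local() {
    gw="$1"
    awk -v gw="$gw" '
        { split($4, a, "/"); if (a[1] == gw) found = 1 }
        END { exit (found ? 0 : 1) }
    '
}
```

Possible use: `ip -o -4 addr show | gateway_is_local 192.0.2.1 && echo "that is a local address, not a gateway"`.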

I think I just reproduced this in a network namespace.

When I do this, I don't see "Request who-has <target IP address> tell 
<target IP address>".  Instead I see "Request who-has <target IP 
address> tell <NetNS IP address>".

19:26:13.919415 ARP, Request who-has 8.8.8.8 tell 192.0.2.1, length 28
19:26:14.943348 ARP, Request who-has 8.8.8.8 tell 192.0.2.1, length 28
19:26:15.967318 ARP, Request who-has 8.8.8.8 tell 192.0.2.1, length 28
19:26:16.991390 ARP, Request who-has 8.8.8.8 tell 192.0.2.1, length 28
19:26:18.015337 ARP, Request who-has 8.8.8.8 tell 192.0.2.1, length 28
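
To separate the two request forms when reading a longer capture, something like the following may help. It is only a sketch: classify_arp_requests is a hypothetical helper, and it assumes tcpdump's text output in the "Request who-has X tell Y, length N" form shown above. A request whose target equals its sender is labelled "self-addressed"; anything else is "ordinary".

```shell
# Read tcpdump text output on stdin and print, for every ARP request,
# a label followed by the target and sender addresses.
# classify_arp_requests is a hypothetical helper name.
classify_arp_requests() {
    awk '
    /ARP, Request who-has/ {
        target = ""; sender = ""
        # Locate the addresses by keyword rather than by fixed position.
        for (i = 1; i < NF; i++) {
            if ($i == "who-has") target = $(i + 1)
            if ($i == "tell")    sender = $(i + 1)
        }
        sub(/,$/, "", sender)   # drop the comma tcpdump prints after the sender
        kind = (target == sender) ? "self-addressed" : "ordinary"
        print kind, target, sender
    }'
}
```

Possible use: `tcpdump -n -r capture.pcap arp | classify_arp_requests`, where capture.pcap is a placeholder file name.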

This in and of itself seems odd to me.  Why is Linux ARPing for an 
address that is obviously not local to the subnet?  (I bound 
192.0.2.1/24, Test-Net-1, to the interface in the NetNS.)
Because the next-hop has been set to the host's own address, the route 
effectively only says which interface to send from and has no usable 
next-hop.  So Linux will ARP for anything going out via that interface's 
link as if it were locally connected, and expects a Proxy ARP-enabled 
device to route the packet to the correct destination.  Cisco routers 
generally have Proxy ARP enabled by default; you can also enable it on a 
Linux router with:

echo 1 > /proc/sys/net/ipv4/conf/{interface}/proxy_arp
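
Before changing the setting, it can be useful to see which interfaces already have it on. A minimal sketch that reads the same /proc files follows. The function name proxy_arp_status and the CONF_DIR override are illustrative additions, the latter only so the function can be tried against a copy of the directory.

```shell
# Print "<interface> <value>" for every interface, where value 1 means
# Proxy ARP is enabled. proxy_arp_status and CONF_DIR are hypothetical
# names; by default the real /proc location is read.
proxy_arp_status() {
    conf_dir="${CONF_DIR:-/proc/sys/net/ipv4/conf}"
    for f in "$conf_dir"/*/proxy_arp; do
        [ -r "$f" ] || continue          # skip if nothing matched or unreadable
        iface="$(basename "$(dirname "$f")")"
        printf '%s %s\n' "$iface" "$(cat "$f")"
    done
}
```

Reading these files does not need root; writing to them, as in the echo above, does.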

Once I bound 8.8.8.8/32 to the vEth in my main NetNS [1] I saw an ARP 
reply.  But pings to 8.8.8.8/32 timed out.

19:28:22.651995 ARP, Request who-has 8.8.8.8 tell 192.0.2.1, length 28
19:28:22.652010 ARP, Reply 8.8.8.8 is-at ca:b0:eb:fa:ef:ab, length 28
19:28:22.652013 IP 192.0.2.1 > 8.8.8.8: ICMP echo request, id 11202, 
seq 1, length 64
19:28:23.711384 IP 192.0.2.1 > 8.8.8.8: ICMP echo request, id 11202, 
seq 2, length 64
19:28:24.735382 IP 192.0.2.1 > 8.8.8.8: ICMP echo request, id 11202, 
seq 3, length 64
19:28:25.759387 IP 192.0.2.1 > 8.8.8.8: ICMP echo request, id 11202, 
seq 4, length 64
Probably the return route for the ping reply was missing or incorrect in 
the main netNS at this point hence no reply seen, though from below it 
seems you managed to work this bit out.

When I checked routing on my main NetNS, I found that 192.0.2.0/24 was 
going out my default gateway.  [2]

So I added a route for 192.0.2.1/32 to go out the vEth device that had 
8.8.8.8/32 bound to it.  (But not "via <IP>", just "dev <device>".)

ip route add 192.0.2.0/24 dev n1

After doing that, I'm actually able to ping 8.8.8.8 from within the 
network namespace.  IMHO this shouldn't be possible as it's only got a 
route to 192.0.2.0/24.
As previously, because you still have a default route (but set to a 
link), it will attempt to route via that link.  Proxy ARP in the main 
netNS would normally take care of sorting out layer-2, but since you 
bound 8.8.8.8/32 directly to the main netNS interface, the main netNS 
could reply to the ARP request without Proxy ARP enabled.  Then once you 
added the return path with the above method, a similar process occurs 
for the reply packet, hence completing the loop and allowing you to ping 
and get a reply.

Note that you're only pinging your local 8.8.8.8 from the other netNS, 
not the real 8.8.8.8 on the Internet.  To get a better understanding of 
Proxy ARP, try doing this without binding 8.8.8.8, and enable Proxy ARP, 
routing, and NAT in the main netNS.  You should find that with all three 
of those together, you get full Internet access from the other netNS 
even though it doesn't have a proper default gateway address set.
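
The experiment suggested here could be set up roughly as follows. This is a sketch under assumptions, not a tested recipe: the namespace name "demo", the peer name n1p, and the uplink eth0 are placeholders, while n1 and 192.0.2.0/24 follow the names used earlier in the thread. The function only prints the commands unless APPLY=1 is set, and whether forwarding actually works as-is may depend on local firewall policy.

```shell
#!/bin/sh
# Dry run by default: each command is printed instead of executed.
# Set APPLY=1 and run as root to actually apply the commands.
NS="${NS:-demo}"          # hypothetical namespace name
UPLINK="${UPLINK:-eth0}"  # assumed Internet-facing interface in the main netNS

run() {
    if [ "${APPLY:-0}" = "1" ]; then "$@"; else printf '%s\n' "$*"; fi
}

setup_proxy_arp_demo() {
    # veth pair: n1 stays in the main netNS, n1p (hypothetical name) moves into $NS
    run ip netns add "$NS"
    run ip link add n1 type veth peer name n1p
    run ip link set n1p netns "$NS"
    run ip link set n1 up
    run ip netns exec "$NS" ip link set n1p up
    run ip netns exec "$NS" ip addr add 192.0.2.1/24 dev n1p
    # Default route that names only the link, with no gateway address
    run ip netns exec "$NS" ip route add default dev n1p
    # Main netNS: return route, Proxy ARP on n1, routing, and NAT
    run ip route add 192.0.2.0/24 dev n1
    run sysctl -w net.ipv4.conf.n1.proxy_arp=1
    run sysctl -w net.ipv4.ip_forward=1
    run iptables -t nat -A POSTROUTING -s 192.0.2.0/24 -o "$UPLINK" -j MASQUERADE
}
```

Calling setup_proxy_arp_demo on its own lists the commands for review. Note that 8.8.8.8 is deliberately not bound anywhere, so any ARP reply for it would have to come from Proxy ARP.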

1) I actually don't know what the main / default routing namespace 
equivalent is.  As far as I can tell, there's no term for it.  At 
least not that I've found.
2) What's surprising by this is that I frequently have 192.0.2.0/24 
bound to a dummy interface on my machine 
1) I'm not actually sure either, though I've seen it commonly referred 
to as the default namespace.
2) When you bound 8.8.8.8 to the interface, did you add it as an 
additional IP address or did it replace the 192.0.2.0/24 IP address? The 
way you bound 8.8.8.8 would affect the behaviour.  Or if this was a 
different interface, was the dummy interface with 192.0.2.0/24 up or 
down during testing?  Again, the interface state can affect routing 
behaviour.


Linux Advanced Routing and Traffic Control