Hi,
On 2/18/19 03:18, Grant Taylor wrote:
On 2/17/19 5:37 PM, Erik Auerswald wrote:
I have only ever seen it on Linux.
Likewise.
I think that this assertion motivates looking at non-Linux systems,
especially traditional routers, and at whether they act as weak or strong
end-systems, and then looking at their ARP handling, excluding Proxy ARP.
Before looking at other systems, I'd want to step back and think how
weak vs strong end-systems /should/ behave regarding ARP flux.
As I see it, the strong ES model should preclude ARP flux, because an ES
that first says "use this MAC to send to my IP X" and then discards the
packet sent based on that information seems nonsensical.
In the weak ES model, telling another ES to send to MAC Y to deliver to
IP X, although the interface with MAC Y uses IP Z≠X, seems OK, but is not
mandatory. To me this feels much like the use of Proxy ARP to work
around misconfigured end-systems. Ignoring ARP requests on all
interfaces but the one with the requested IP seems OK, too.
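
The two behaviours described above can be modelled in a few lines. This
is only a sketch to make the distinction concrete, not how any real IP
stack is implemented. The names Interface and arp_reply, the model
labels "strong" and "weak", and the addresses used with them are all
made up for illustration. The "weak" branch shows the permissive
variant; as noted above, a weak ES could just as well answer only on
the interface that owns the IP.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interface:
    # Hypothetical model of an interface: one MAC, one IP.
    name: str
    mac: str
    ip: str


def arp_reply(interfaces, ingress, target_ip, model):
    """Return (sender MAC, sender IP) of the ARP reply, or None if ignored.

    The reply, if any, always carries the MAC of the ingress interface,
    because that is the interface that heard the request.
    """
    if model == "strong":
        # Strong ES: only answer for an IP configured on the ingress interface.
        owners = [ingress]
    elif model == "weak":
        # Permissive weak ES (assumption): answer for any local IP.
        owners = list(interfaces)
    else:
        raise ValueError(f"unknown model: {model!r}")

    if any(iface.ip == target_ip for iface in owners):
        return (ingress.mac, target_ip)
    return None
```

With two interfaces eth0 (IP X) and eth1 (IP Z), a request for Z arriving
on eth0 yields None under "strong", and a reply pairing the MAC of eth0
with IP Z under "weak".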
Aside: I think of Proxy ARP as a form of routing.
Freely combining MAC and IP addresses in ARP replies looks quite similar
to Proxy ARP to me.
After all, the Proxy ARP router is doing exactly what it would do to
route a packet if it had naturally come to it / its MAC address. All
Proxy ARP is doing is responding to ARP requests for things on the other
side of the router such that the packet does come to the router.
That is correct, but Proxy ARP will result in a router answering an ARP
request that was not for the receiving interface, thus it could be
confused with ARP flux in a test.
[...]
Again, any "traditional" router accepts IP packets directed to any of
its interface IPs irrespective of the ingress interface. That is the
basis for using a loopback address for router management or BGP
sessions. In that case a router acts as an end-system as well.
That's not entirely true. Especially when filtering / firewalling is in
place to only allow traffic from specific interfaces.
Filtering / firewalling can be in effect in Linux as well, including for
Ethernet (ebtables). That would most likely affect ARP flux and weak ES
model behaviour as well, depending on the rule set.
I've also viewed that as the traffic would be routed through the device
to the proper interface which would then process it accordingly. In,
over / through and then up the IP stack instead of in and directly up
the IP stack.
It is not observably "routed" in that all routing actions (L2 rewrite,
TTL decrement) usually happen when sending the packet _out_ the egress
interface.
[...]
Let's back up and discuss what is actually allowing ARP flux to happen.
As I understand it, the /flux/ comes from the fact that the MAC address
that ARPing hosts get replies from changes and fluctuates. Hence the name.
It's my understanding that this happens because Linux does not filter
(in any meaning of the word) incoming ARP requests (or outgoing replies)
based on the physical interface. This is especially true when you have
multiple interfaces in the same broadcast domain.
That is a logical explanation of the name "ARP flux".
Aside: The last time I tried to put two interfaces in the same subnet
and connect them to the same broadcast domain on a Cisco, it would not
allow me to do so.
Correct. Huawei VRP allows a special case (a loopback interface with a
/32 IP inside an IP subnet with shorter prefix active on another
interface, including Ethernet interfaces), but I have not yet found time
to thoroughly test the behaviour of that. Other networking equipment I
used did not allow two interfaces in the same (or overlapping) subnet(s).
I'd say it is somewhat independent of the weak ES model. It is a
symptom of the Linux IP stack. That IP stack may be built around weak
ES model ideas. Other IP stacks adhere to the weak ES model as well
without exhibiting ARP flux.
Sorry for being pedantic, but I think we need to clearly define the
configuration and behavior that we're discussing.
I say this because I think that "ARP flux" is a symptom of having a
Linux box with two interfaces in the same broadcast domain, thus able to
hear the same ARP request and that the flux comes from the ensuing race
conditions as to which interface will be processed -and- reply first.
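
The race described above can be illustrated with a toy simulation. It
is a sketch under stated assumptions: every interface that hears the
request replies with its own MAC, replies arrive in random order, and
the requester keeps whichever reply arrives last. Real ARP caches may
behave differently. The function name observe_arp_cache and the MAC
addresses used with it are hypothetical.

```python
import random


def observe_arp_cache(reply_macs, requests, rng):
    """Return the MAC the requester ends up caching after each request.

    reply_macs: MACs of all interfaces that answer the same ARP request.
    rng: a random.Random instance that decides the arrival order.
    """
    cached = []
    for _ in range(requests):
        arrival_order = list(reply_macs)
        rng.shuffle(arrival_order)
        # Assumption: the last reply to arrive overwrites the cache entry.
        cached.append(arrival_order[-1])
    return cached


if __name__ == "__main__":
    macs = ["02:00:00:00:00:0a", "02:00:00:00:00:0b"]
    print(observe_arp_cache(macs, 10, random.Random()))
```

With two answering interfaces the cached MAC changes between requests,
which is the "flux". With a single answering interface it cannot.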
I feel like this same scenario is seldom played out in traditional
network gear. And if we want to have the discussion about this, we
should configure said gear comparably and test how it behaves.
I will also state that Linux may well respond to ARP requests from an
inside interface for IPs on the outside interface. But in such a
scenario, there is only one interface connected to the broadcast domain,
thus there is nothing to flux over as it will always be the single
possible interface.
So, let's define what the connections are, and how things are configured.
I'm stating two interfaces connected to the same broadcast domain, each
with IPs in different subnets. (Thus the broadcast domain is overloaded
and has multiple subnets on it.) I think there is a reasonable chance
that the ARP flux symptoms can occur in this configuration.
I'm thinking Linux /kernel/ default (no distro sysctl modifications or
kernel compilation tweaks). I'm also thinking Proxy ARP is disabled.
Do you agree? Or do you want to alter the configuration?
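
One way to check the "kernel default, Proxy ARP disabled" precondition
on a test box is to read the per-interface ARP sysctls from procfs. The
sketch below reports any of arp_ignore, arp_announce, arp_filter and
proxy_arp that is not 0, which is the kernel default for all four.
These knobs are not named in this thread; they are the ones assumed
here to be relevant. The function name non_default_arp_sysctls and the
conf_root parameter are made up for illustration.

```python
from pathlib import Path

# ARP-related IPv4 sysctls found under /proc/sys/net/ipv4/conf/<iface>/.
ARP_SYSCTLS = ("arp_ignore", "arp_announce", "arp_filter", "proxy_arp")


def non_default_arp_sysctls(conf_root="/proc/sys/net/ipv4/conf"):
    """Return {(iface, sysctl): value} for every listed sysctl that is not 0."""
    deviations = {}
    for iface_dir in sorted(Path(conf_root).iterdir()):
        for name in ARP_SYSCTLS:
            path = iface_dir / name
            if not path.is_file():
                continue  # sysctl not present for this entry
            value = int(path.read_text().strip())
            if value != 0:
                deviations[(iface_dir.name, name)] = value
    return deviations


if __name__ == "__main__":
    for (iface, name), value in non_default_arp_sysctls().items():
        print(f"{iface}: {name} = {value}")
```

An empty result means none of the four knobs has been changed from 0 on
any interface, including the "all" and "default" entries.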
I want to extend that scenario to include two interfaces A and B with
different IP addresses connected to two separate broadcast domains.
While that does not result in fluctuating ARP replies (ARP flux), it
does result in ARP replies combining MAC of interface A with IP of
interface B.
Both scenarios (two interfaces connected to one broadcast domain, two
interfaces connected to separate broadcast domains) show symptoms of the
same underlying cause. The name ARP _flux_ is more fitting to the first
scenario, the second could be better described as "ARP confusion" (I
made that name up just now).
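
The two scenarios can be told apart mechanically. The sketch below
assumes the behaviour discussed in this thread: every interface that
hears an ARP request for any local IP answers with its own MAC. The
names Interface and classify, the broadcast-domain labels and the
returned strings are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interface:
    mac: str
    ip: str
    domain: str  # label of the broadcast domain the interface is attached to


def classify(interfaces, requester_domain, target_ip):
    """Classify what a requester in requester_domain sees when ARPing target_ip."""
    if all(iface.ip != target_ip for iface in interfaces):
        return "no reply"  # the IP is not local to this host
    hearing = [i for i in interfaces if i.domain == requester_domain]
    if not hearing:
        return "no reply"  # no interface hears the request
    if len({i.mac for i in hearing}) > 1:
        return "ARP flux"  # several MACs race to answer
    if hearing[0].ip != target_ip:
        return "ARP confusion"  # MAC of one interface, IP of another
    return "normal"
```

Two interfaces in one domain give "ARP flux" for either IP. Two
interfaces in separate domains give "ARP confusion" when the IP of the
far interface is requested, and "normal" otherwise.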
[...]
That being said, you do have me questioning things. At the moment, I'm
sticking with what I've thought for years. But I am interested in
continuing the conversation and learning, whatever the lesson may be.
Likewise.
Best regards,
Erik