Re: ARP flux vs. weak/strong ES model

Erik Auerswald <auerswal@xxxxxxxxxxxxxxxxx> · Mon, 18 Feb 2019 01:37:29 +0100

Hi,

On 2/17/19 06:10, Grant Taylor wrote:
On 2/15/19 1:11 PM, Erik Auerswald wrote:
[...]
The earliest reference I could find to "ARP flux" was in the Guide to IP 
Layer Network Administration document, with references going back to 2003.

I get the impression that the "ARP flux" nomenclature may have 
originated from the Linux community and grown out from there.

Perhaps because it is a Linux-only thing? That is just a guess.

[...]
I'd say the distinction between weak and strong end-system (ES) model 
is related to the problem at hand, but "ARP flux" is a distinct issue 
not necessarily observed with a host implementing the weak (ES) model.

Do you know of any examples where the "ARP flux" /symptom/ was observed 
on a "strong end system"?

I have only ever seen it on Linux.

RFC 1122 continues to state that "[The weak ES] model [...] is 
necessary for hosts that have embedded gateway functionality."

I disagree with that assertion.  (Not your recounting of it.)

I'd say that assertion is often correct, but need not be.

I think that this assertion motivates looking at non-Linux systems,
especially traditional routers, and whether they act as weak or strong
end-systems. The next step is to look at their ARP handling, excluding
Proxy ARP.

A gateway (or router) needs to accept IP packets not addressed to the 
receiving interface in order to forward them. But most commercial 
routers will not answer an ARP request ingressing on the wrong 
interface, unless Proxy-ARP is activated. By default, Linux does 
answer such ARP requests.

I believe the MUST vs MUST NOT text is primarily about traffic to the 
end system, not traffic passing through the system.  Particularly how 
the system's IP stack responds in relation to IPs bound locally.

I think that routing / forwarding is decidedly different.  Yes, a router 
must allow traffic in that is not destined to the router.  (Assuming 
that we're talking about a forwarding interface and not a management 
interface.)  That's the router's job.

Yes, a system dedicated to packet forwarding is different. Often it is 
an end-system as well for management purposes, sometimes for additional 
services (including BGP). RFC 1122 looks at IP hosts, not necessarily 
dedicated gateways, but includes hosts that act as gateways, too. Thus I 
do not say that RFC 1122 is necessarily pertinent to dedicated routers, 
but often even dedicated routers act like end-systems with embedded 
gateway functionality.

But the router does not need to respond to ARP requests from one 
interface for IPs on a different interface.

Exactly. Unless Proxy ARP is active, any "traditional" router I know 
answers ARP requests only if they arrive at the interface configured 
with the IP address in the ARP request.
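
To make the two ARP behaviours concrete, here is a minimal sketch in
Python. It is a toy model, not taken from any real IP stack: the names
Interface and should_answer_arp, the "strict" flag, and the documentation
addresses are all made up for illustration. "strict" stands for the
traditional router behaviour described above, "not strict" for the Linux
default.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address


@dataclass(frozen=True)
class Interface:
    """Hypothetical interface record: a name and one IPv4 address."""
    name: str
    address: IPv4Address


def should_answer_arp(interfaces, ingress, target, strict):
    """Decide whether a host answers an ARP request for `target`
    that arrived on the interface named `ingress`.

    strict=True  : answer only if `target` is configured on `ingress`
                   (the "traditional" router behaviour, no Proxy ARP).
    strict=False : answer if `target` is configured on any interface
                   (the Linux default that produces "ARP flux").
    """
    if strict:
        candidates = [i for i in interfaces if i.name == ingress]
    else:
        candidates = list(interfaces)
    return any(i.address == target for i in candidates)


host = [
    Interface("eth0", IPv4Address("192.0.2.1")),
    Interface("eth1", IPv4Address("198.51.100.1")),
]

# A request for eth1's address arrives on eth0 (the "wrong" interface).
wrong_if_target = IPv4Address("198.51.100.1")
print(should_answer_arp(host, "eth0", wrong_if_target, strict=True))   # False
print(should_answer_arp(host, "eth0", wrong_if_target, strict=False))  # True
```

Both variants answer a request for 192.0.2.1 arriving on eth0; they
differ only for requests arriving at the wrong interface.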

Again, any "traditional" router accepts IP packets directed to any of 
its interface IPs irrespective of the ingress interface. That is the 
basis for using a loopback address for router management or BGP 
sessions. In that case a router acts as an end-system as well.

The above can often be changed via configuration, to separate management 
interfaces from forwarding interfaces.
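
The IP side of this can be sketched the same way. The following toy
model follows a literal reading of the weak and strong ES rules for
datagrams addressed to the host itself; forwarding is deliberately left
out. ESModel, accepts_ip_packet, the interface names, and the addresses
are hypothetical and not taken from any implementation.

```python
from enum import Enum
from ipaddress import IPv4Address


class ESModel(Enum):
    WEAK = "weak"
    STRONG = "strong"


def accepts_ip_packet(addresses, ingress, destination, model):
    """Decide whether a host accepts an IP packet for local delivery.

    `addresses` maps interface name -> IPv4 address configured on it.
    WEAK  : accept if `destination` is any of the host's addresses.
    STRONG: accept only if `destination` is configured on the
            interface the packet arrived on.
    """
    if model is ESModel.STRONG:
        return addresses.get(ingress) == destination
    return destination in addresses.values()


router = {
    "eth0": IPv4Address("192.0.2.1"),
    "eth1": IPv4Address("198.51.100.1"),
    "lo0": IPv4Address("203.0.113.1"),  # loopback used for management/BGP
}

# A management session to the loopback address arrives via eth0.
loopback = IPv4Address("203.0.113.1")
print(accepts_ip_packet(router, "eth0", loopback, ESModel.WEAK))    # True
print(accepts_ip_packet(router, "eth0", loopback, ESModel.STRONG))  # False
```

In this model the loopback-for-management setup only works with the weak
rule, which is the sense in which a router acts as a weak end-system
here, independent of how it answers ARP requests.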

This answer to ARP requests arriving at the wrong interface is the 
root cause for most problems commonly subsumed under "Linux implements 
the weak ES model". But those problems do not occur with other weak ES 
model implementing systems, e.g. Extreme Networks EXOS switches.

Each IP stack is different.  I wouldn't take the difference in behavior 
of the Linux IP stack and EXOS's IP stack as definitive behavior.

But the weak vs. strong ES model is independent of IP stacks. It is a 
description of behaviour, and can be used as guidance when implementing 
an IP stack.

The two examples are just that, examples, nothing more. Cisco IOS (with
Proxy ARP deactivated) is another example of the weak ES model without
ARP flux. But a single example (e.g. Extreme or Cisco) suffices to show
that the weak ES model does not imply ARP flux. The Linux example shows
that the weak ES model and ARP flux can occur together.

Additionally, changing Linux behaviour (via sysctl) to not answer 
those ARP requests does not change the weak ES model applicability to 
Linux (i.e. IP packets delivered inside an Ethernet frame addressed to 
an interface MAC of a Linux system are accepted even if they are 
addressed to another IP address of the Linux host; packet forwarding 
is not affected either).
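
The text above does not name the sysctl. A likely candidate, and this is
an assumption, is arp_ignore: according to the kernel's ip-sysctl
documentation, a value of 1 makes Linux reply only if the target IP
address is configured on the incoming interface. The small Python helper
below just renders sysctl.conf-style lines for it; the function name and
its defaults are invented for illustration.

```python
# Values documented for arp_ignore in the kernel's ip-sysctl text;
# 4-7 are reserved there and therefore rejected here.
VALID_ARP_IGNORE = {0, 1, 2, 3, 8}


def arp_sysctl_lines(interfaces=("all",), arp_ignore=1):
    """Render sysctl.conf-style lines that set arp_ignore.

    `interfaces` holds the conf/ entries to set ("all", "default", or
    an interface name). The kernel uses the maximum of the "all" and
    the per-interface value, so setting "all" alone is usually enough.
    """
    if arp_ignore not in VALID_ARP_IGNORE:
        raise ValueError(f"unsupported arp_ignore value: {arp_ignore}")
    return [
        f"net.ipv4.conf.{name}.arp_ignore = {arp_ignore}"
        for name in interfaces
    ]


print("\n".join(arp_sysctl_lines()))
# net.ipv4.conf.all.arp_ignore = 1
```

The output can go into a file under /etc/sysctl.d/ or be applied with
sysctl -w. It changes only which ARP requests get answered, not which IP
packets the host accepts or forwards.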

Linux's cavalier default behaviour in answering ARP requests might be 
motivated by the weak ES model in helping other systems on the LAN 
reach the Linux server, even if they use an IP address assigned to 
another LAN the Linux server is connected to. Thus the problem of "ARP 
flux" is probably closely related to how Linux implements the weak ES 
model, but not necessarily to the weak ES model itself as described in 
RFC 1122.

I feel like "ARP flux" is a /symptom/ of "weak ES model".

I'd say it is somewhat independent of the weak ES model. It is a symptom 
of the Linux IP stack. That IP stack may be built around weak ES model 
ideas. Other IP stacks adhere to the weak ES model as well without 
exhibiting ARP flux. Thus I do not accept that "ARP flux" is a symptom 
(or necessary result) of "weak ES model".

But it seems to me that combining strong ES model with ARP flux does not 
make sense. As such I do see some relation between ARP flux and weak ES 
model.

[...]
Thus I argue that using "ARP flux" to describe the ARP problem 
observed with Linux is preferable to attributing the problems to 
Linux's implementation of the weak ES model.

Your logic makes sense.  I choose to think something different.

Then we have to agree to disagree.

Thanks,
Erik

Linux Advanced Routing and Traffic Control