Re: [Lf_carrier] [CGL 5.0] [CAF.2.1] [Enea Linux] Ethernet MAC address takeover

Hello guys,

Sorry for the late response. I hope I can offer a correct answer; maybe Joe will also be able to provide his input here, but here is what I think of this:

In large networks it is disadvantageous to modify IP-to-MAC (Media Access Control) address mappings, because some routers may not refresh their ARP caches, which can disrupt network traffic. For that reason, the fail-over functionality is realized by taking over the MAC address instead.

In such systems, all nodes use the same fixed IP and hardware MAC address on the network, and the nodes are differentiated by the state of the servicing interface. The master (active) node has its interface in the up state, while the slave nodes' interfaces are kept down. If the service fails over to another node, that node's interface is brought up. Client requests are serviced by the node whose interface is up.
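As a rough illustration of what "bringing the interface up" could look like on the standby node, here is a minimal Python sketch that only builds the command lines and does not run them. The interface name, MAC and IP in the example are made up, the helper names are hypothetical, and it assumes iproute2 and the iputils arping are installed and that the shared IP is already configured on the interface:

```python
import re

# Six colon-separated hex octets, e.g. 02:00:00:aa:bb:01
MAC_RE = re.compile(r"^([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$")


def takeover_commands(iface, shared_mac, shared_ip, announce_count=3):
    """Return the commands a standby node could run to become active.

    Hypothetical helper: it only builds argument lists, it runs nothing.
    """
    if not MAC_RE.match(shared_mac):
        raise ValueError(f"not a MAC address: {shared_mac!r}")
    return [
        # Program the shared hardware address while the link is still down.
        ["ip", "link", "set", "dev", iface, "address", shared_mac],
        # Bring the servicing interface up; from here on this node is active.
        ["ip", "link", "set", "dev", iface, "up"],
        # Announce with unsolicited ARP so switches relearn the port.
        ["arping", "-U", "-c", str(announce_count), "-I", iface, shared_ip],
    ]


def release_commands(iface):
    """Return the command that keeps a standby node's interface down."""
    return [["ip", "link", "set", "dev", iface, "down"]]


if __name__ == "__main__":
    # Example values only (assumptions, not taken from any real setup).
    for cmd in takeover_commands("eth1", "02:00:00:aa:bb:01", "192.0.2.10"):
        print(" ".join(cmd))
```

Building the commands separately from running them makes it easy to log or review them first; a cluster resource agent would do the actual execution.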

Transferring the MAC address is beneficial if the resources need to be relocated very quickly, but be aware that having multiple interfaces with the same IP or MAC address connected to a network can destabilize it. This makes it highly important to monitor the takeover process and to completely remove the failed server from the network (for example, by keeping it powered off). You could take a closer look at a STONITH device (Heartbeat is a good starting point).
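To make the ordering concrete (fence first, take over second), here is a small Python model. It is a toy simulation, not Heartbeat or Pacemaker code, and all class and function names are made up for illustration:

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    powered: bool = True
    iface_up: bool = False


class FencingError(RuntimeError):
    """Raised when a takeover is attempted before the peer is fenced."""


def fence(node):
    """Stand-in for a STONITH action: power the node off."""
    node.powered = False
    node.iface_up = False


def take_over(failed, standby):
    """Bring the standby's interface up only if the failed node is off."""
    # A node that is still powered could keep answering on the shared
    # IP/MAC, which is exactly the duplicate-address situation to avoid.
    if failed.powered:
        raise FencingError(f"{failed.name} is not fenced yet")
    standby.iface_up = True


def active_nodes(nodes):
    """Names of nodes that are powered and have the interface up."""
    return [n.name for n in nodes if n.powered and n.iface_up]


if __name__ == "__main__":
    a, b = Node("node-a", iface_up=True), Node("node-b")
    fence(a)          # remove the failed node from the network first
    take_over(a, b)   # only then activate the standby
    print(active_nodes([a, b]))
```

The point of the check in take_over is that at no moment are two powered nodes answering on the same address.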

Useful links:



Hope I was able to help you :D


Regards,
Alex V.



On Monday, October 5, 2015 4:31 PM, Stefan Sicleru <Stefan.Sicleru@xxxxxxxx> wrote:


Hello,
 
We (at Enea) are working towards a CGL 5.0 compliant distribution and we have some questions regarding
the requirement specified within the subject.
 
The MAC address takeover requirement sounds like this:
 
--
CGL specifies a mechanism to program and announce MAC addresses on Ethernet 
interfaces so that when a SW Failure event occurs, redundant nodes may begin 
receiving traffic for failed nodes. 
--
 
We’ve met the CAF.2.2 requirement (which is the IP address takeover scenario) and we ran into
some issues regarding CAF.2.1. For the IP scenario we have deployed a Pacemaker+Corosync setup
and everything behaved as expected. However, I have not been able to use the same tools for the
Ethernet takeover scenario. To the best of my knowledge, the closest thing Pacemaker offers is to
configure a load-balancing scheme that involves a cluster of nodes answering to the same IP and MAC
address in a round-robin fashion. But this is not about having a fail-over mechanism for the unicast MAC
addresses (as the CGL requirement specifies), but rather a fail-over mechanism of resources assigned
to multiple machines that share the same multicast MAC address.
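For reference, the unicast/multicast distinction mentioned above is carried by the least significant bit of the first octet of the address (the I/G bit). A minimal Python sketch of that check follows; the function names are hypothetical and the sample addresses are only examples:

```python
def mac_octets(mac):
    """Parse 'aa:bb:cc:dd:ee:ff' (or with '-') into six integers."""
    parts = mac.replace("-", ":").split(":")
    if len(parts) != 6:
        raise ValueError(f"expected 6 octets: {mac!r}")
    octets = [int(p, 16) for p in parts]
    if any(o < 0 or o > 0xFF for o in octets):
        raise ValueError(f"octet out of range: {mac!r}")
    return octets


def is_multicast(mac):
    """True if the I/G bit (lowest bit of the first octet) is set."""
    return bool(mac_octets(mac)[0] & 0x01)


def is_locally_administered(mac):
    """True if the U/L bit (second-lowest bit of the first octet) is set."""
    return bool(mac_octets(mac)[0] & 0x02)


if __name__ == "__main__":
    for mac in ("01:00:5e:00:00:01", "02:00:00:aa:bb:01"):
        kind = "multicast" if is_multicast(mac) else "unicast"
        print(mac, kind)
```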
 
Since one request reaches all nodes within the cluster (through the shared multicast MAC), Pacemaker
uses iptables rules on the nodes so that any given packet will be grabbed by exactly one node (through
a hashing policy). This gives us a form of load-balancing. The cluster can be instructed to clone resources
in case of a failure, hence we can achieve a form of a fail-over capability. But then again, this is rather
different from the CGL requirement w.r.t unicast MAC address takeover.
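The "exactly one node grabs each packet" idea can be sketched in a few lines of Python. This is only a model of the hashing policy described above: CRC32 is used as a stand-in and is not the hash the kernel uses, and the function names are made up for illustration:

```python
import ipaddress
import zlib


def responsible_node(src_ip, total_nodes):
    """Map a source IP to a node number in 1..total_nodes.

    CRC32 is an illustrative stand-in for the real hash function.
    """
    packed = ipaddress.ip_address(src_ip).packed
    return zlib.crc32(packed) % total_nodes + 1


def accepts(local_nodes, src_ip, total_nodes):
    """True if this machine handles the packet.

    local_nodes is the set of node numbers this machine currently owns;
    after a failure, a survivor can own the failed node's number as well.
    """
    return responsible_node(src_ip, total_nodes) in local_nodes


if __name__ == "__main__":
    total = 3
    for ip in ("198.51.100.7", "198.51.100.8", "203.0.113.42"):
        owners = [n for n in range(1, total + 1) if accepts({n}, ip, total)]
        print(ip, "->", owners)
```

Every node sees every packet, evaluates the same hash, and drops the packet unless the result is one of its own node numbers, so no coordination per packet is needed.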
 
Moreover, if we look over the code of the “IPaddr2” Resource Agent, we see that the MAC string (provided as
a parameter) is only used for the “--clustermac” value of the iptables CLUSTERIP target. There is no other use
for the MAC string provided by IPaddr2. I have not found any resource agent with Ethernet address cloning
capabilities.
 
I would like to know if the scenario described above is relevant for the requirement. Or should we try
to offer the same fail-over mechanism as we did for the IP takeover scenario? Should we try cloning
the unicast MAC address of the failed interface by using other means? If so, can you give us pointers
to some tools that may be used within a clustering environment?
 
Aside from these, what would be the use cases for this scenario, of having redundancy at MAC level?
The only use case I can think of is when you don’t want the cluster’s “clients” (routers, switches, rarely client
machines) to update their own ARP caches (after a successful IP address takeover). But this is only
a synthetic example; I don’t see it as a real-life scenario.
 
Your feedback is highly appreciated.
 
Warm regards,
Stefan
 
 
 

_______________________________________________
Lf_carrier mailing list
Lf_carrier@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linuxfoundation.org/mailman/listinfo/lf_carrier
