Hi,
I have not measured it. We have been running this way for years
now and have never seen "transport endpoint is not connected"
with this setup.
We use the default options, BONDING_OPTS='mode=6 miimon=100'. From the documentation:
miimon=time_in_milliseconds
Specifies (in milliseconds) how often MII link monitoring occurs. This is useful if high availability is required because MII is used to verify that the NIC is active.
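For readers who want to see where that option lives, here is a minimal sketch of a bond definition, assuming a RHEL-style host that reads ifcfg files. The file path, the device name bond0 and the address are placeholders, not details taken from this thread; only the BONDING_OPTS line comes from the message above.

```shell
# Sketch of /etc/sysconfig/network-scripts/ifcfg-bond0 (path and names are assumptions).
DEVICE=bond0
TYPE=Bond
BONDING_MASTER=yes
BOOTPROTO=none
ONBOOT=yes
# Placeholder address from the documentation range; replace with the storage network address.
IPADDR=192.0.2.10
PREFIX=24
# Mode 6 (balance-alb) with an MII link check every 100 ms, as quoted above.
BONDING_OPTS='mode=6 miimon=100'
# Each member NIC would get its own ifcfg file containing MASTER=bond0 and SLAVE=yes.
```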
On 2/25/19 2:22 PM, Martin Toth wrote:
How long does it take your devices (using mode 5 or 6; ALB is
preferred for GlusterFS) to take over the MAC? A slow takeover could
cause your error, "transport endpoint is not connected", because
gluster has some timeouts set by default.
I am using LACP and it works without any problem.
Can you share your mode 5 / 6 configuration?
Thanks.
Martin
Hi,
Well no, mode 5 and mode 6 also have fault
tolerance and don't need any special switch config.
A quick Google search:
https://serverfault.com/questions/734246/does-balance-alb-and-balance-tlb-support-fault-tolerance
Bonding Mode 5 (balance-tlb) works by
looking at all the devices in the bond, and
sending out the slave with the least current
traffic load. Traffic is only received by one
slave (the "primary slave"). If a slave is lost,
that slave is not considered for transmission, so
this mode is fault-tolerant.
Bonding Mode 6 (balance-alb) works as
above, except incoming ARP requests are
intercepted by the bonding driver, and the bonding
driver generates ARP replies so that external
hosts are tricked into sending their traffic into
one of the other bonding slaves instead of the
primary slave. If many hosts in the same broadcast
domain contact the bond, then traffic should
balance roughly evenly into all slaves.
If a slave is lost in Mode 6, then it
may take some time for a remote host to time out
its ARP table entry and send a new ARP request. A
TCP or SCTP retransmission tends to lead to an ARP
request fairly quickly, but a UDP datagram does
not, and will rely on the usual ARP table refresh.
So Mode 6 is fault tolerant,
but convergence on slave loss may take some time
depending on the Layer 4 protocol used.
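When testing how quickly a bond reacts to a pulled cable, it helps to see which slaves the driver currently reports as up. The following is a small sketch that summarises the text the bonding driver exposes under /proc/net/bonding; the function name bond_slave_status and the device name bond0 are assumptions made for illustration.

```shell
# Print "<slave> <MII status>" for each slave found in bonding status text on stdin.
# The bond-wide "MII Status" line appears before any slave and is skipped.
bond_slave_status() {
    awk '
        /^Slave Interface:/ { slave = $3 }
        /^MII Status:/ && slave != "" { print slave, $3; slave = "" }
    '
}

# Usage on a live host (bond0 is an assumed device name):
#   bond_slave_status < /proc/net/bonding/bond0
```

Running it before and after unplugging a cable shows whether the driver has noticed the link loss; it does not show when remote peers have refreshed their ARP entries.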
If you are worried about fast fault
tolerance, then consider using Mode 4 (802.3ad aka
LACP) which negotiates link aggregation between
the bond and the switch, and constantly updates
the link status between the aggregation partners.
Mode 4 also has configurable load balance hashing
so is better for in-order delivery of TCP streams
compared to Mode 5 or Mode 6.
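As an illustration of the Mode 4 suggestion, here is a hedged sketch of the same kind of ifcfg file with LACP options. The device name, the fast LACP rate and the layer3+4 hash policy are example choices, not settings reported by anyone in this thread. Mode 4 also needs matching LACP configuration on the switch side, and a bond that spans two switches generally requires them to act as a single LACP partner (for example through stacking or a multi-chassis LAG).

```shell
# Sketch of /etc/sysconfig/network-scripts/ifcfg-bond0 for 802.3ad (names are assumptions).
DEVICE=bond0
TYPE=Bond
BONDING_MASTER=yes
BOOTPROTO=none
ONBOOT=yes
# mode=802.3ad is Mode 4; lacp_rate and xmit_hash_policy are optional tuning knobs.
BONDING_OPTS='mode=802.3ad miimon=100 lacp_rate=fast xmit_hash_policy=layer3+4'
```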
https://wiki.linuxfoundation.org/networking/bonding
balance-tlb or 5
Adaptive transmit load balancing: channel bonding that does not require any special switch support. The outgoing traffic is distributed according to the current load (computed relative to the speed) on each slave. Incoming traffic is received by the current slave. If the receiving slave fails, another slave takes over the MAC address of the failed receiving slave.
Prerequisite:
- Ethtool support in the base drivers for retrieving the speed of each slave.
balance-alb or 6
Adaptive load balancing: includes balance-tlb plus receive load balancing (rlb) for IPv4 traffic, and does not require any special switch support. The receive load balancing is achieved by ARP negotiation.
- The bonding driver intercepts the ARP Replies sent by the local system on their way out and overwrites the source hardware address with the unique hardware address of one of the slaves in the bond such that different peers use different hardware addresses for the server.
- Receive traffic from connections created by the server is also balanced. When the local system sends an ARP Request the bonding driver copies and saves the peer's IP information from the ARP packet.
- When the ARP Reply arrives from the peer, its hardware address is retrieved and the bonding driver initiates an ARP reply to this peer assigning it to one of the slaves in the bond.
- A problematic outcome of using ARP negotiation for balancing is that each time that an ARP request is broadcast it uses the hardware address of the bond. Hence, peers learn the hardware address of the bond and the balancing of receive traffic collapses to the current slave. This is handled by sending updates (ARP Replies) to all the peers with their individually assigned hardware address such that the traffic is redistributed. Receive traffic is also redistributed when a new slave is added to the bond and when an inactive slave is re-activated. The receive load is distributed sequentially (round robin) among the group of highest speed slaves in the bond.
- When a link is reconnected or a new slave joins the bond the receive traffic is redistributed among all active slaves in the bond by initiating ARP Replies with the selected MAC address to each of the clients. The updelay parameter (detailed below) must be set to a value equal or greater than the switch's forwarding delay so that the ARP Replies sent to the peers will not be blocked by the switch.
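The updelay requirement can be turned into a concrete setting. The sketch below assumes a switch forwarding delay of 30 seconds, which is a made-up figure to be replaced with the real value for the switch in use, and rounds it up to a multiple of miimon. The kernel bonding documentation states that updelay should be a multiple of miimon and is otherwise rounded down, so rounding up here keeps the effective value from falling below the switch delay. The helper name round_updelay is hypothetical.

```shell
# Round a switch forwarding delay up to the next multiple of miimon.
# Both arguments are in milliseconds.
round_updelay() {
    fwd_delay_ms=$1
    miimon_ms=$2
    echo $(( (fwd_delay_ms + miimon_ms - 1) / miimon_ms * miimon_ms ))
}

# Hypothetical values: 100 ms link polling, 30 s forwarding delay on the switch port.
MIIMON=100
UPDELAY=$(round_updelay 30000 "$MIIMON")
BONDING_OPTS="mode=6 miimon=${MIIMON} updelay=${UPDELAY}"
echo "$BONDING_OPTS"
```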
On 2/25/19 1:16 PM, Martin Toth wrote:
Hi Alex,
you have to use bond mode 4 (LACP -
802.3ad) in order to achieve redundancy of
cables/ports/switches. I suppose this is what you
want.
BR,
Martin
Hi All,
I was asking whether it is possible to have the two separate
cables connected to two different physical switches. When I tried
mode 6 or mode 1 in this setup, gluster refused to start the
volumes, giving me "transport endpoint is not connected".
server1: cable1 ---------------- switch1 --------------------- server2: cable1
                                    |
server1: cable2 ---------------- switch2 --------------------- server2: cable2
The two switches are also connected to each other. This
is done to achieve redundancy for the switches.
When I disconnected cable2 from both servers, gluster was
happy.
What could be the problem?
Thanks,
Alex
Hi,
We use bonding mode 6 (balance-alb) for GlusterFS traffic:
https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.4/html/administration_guide/network4
"Preferred bonding mode for Red Hat Gluster Storage client is mode 6 (balance-alb), this allows client to transmit writes in parallel on separate NICs much of the time."
Regards,
Jorick Astrego
On 2/25/19 5:41 AM, Dmitry Melekhov wrote:
23.02.2019 19:54, Alex K wrote:
Hi all,
I have a replica 3 setup where each server was configured with dual interfaces in mode 6 bonding. All cables were connected to one common network switch.
To add redundancy to the switch and avoid it being a single point of failure, I connected the second cable of each server to a second switch. This did not work: gluster refused to start the volume, logging "transport endpoint is disconnected", although all nodes were able to reach each other (ping) in the storage network. I switched the mode to mode 1 (active/passive) and initially it worked, but after a reboot of the whole cluster the same issue appeared. Gluster is not starting the volumes.
Isn't active/passive supposed to work like that? Can one have such a redundant network setup, or are there other recommended approaches?
Yes, we use LACP, which I guess is mode 4 (we use teamd).
It is, no doubt, the best way.
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-users
With kind regards,
Jorick Astrego
Netbulae Virtualization Experts
Tel: 053 20 30 270 | info@xxxxxxxxxxx | Staalsteden 4-3A | KvK 08198180 | Fax: 053 20 30 271 | www.netbulae.eu | 7547 TA Enschede | BTW NL821234584B01