Hi all!
I am experiencing sporadic problems with my cluster setup and hope someone
has an idea. First, some facts:
Type: RHEL 6.1 two-node cluster (corosync 1.2.3-36) on two Dell R610s,
each with a quad-port NIC
NICs:
- interfaces em1/em2 are bonded using mode 5 (balance-tlb); these interfaces
are cross-connected with no network element in between (intended for the
cluster housekeeping communication)
- interfaces em3/em4 are bonded using mode 1 (active-backup); these
interfaces are connected to two switches
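For completeness, this is roughly how the bonds are set up. A sketch from
memory; the bond names (bond1 = em1/em2, bond0 = em3/em4), the netmask and
the exact BONDING_OPTS are assumptions, not copied from the machines:

# /etc/sysconfig/network-scripts/ifcfg-bond1 (cluster link, em1/em2, cross-connected)
DEVICE=bond1
BOOTPROTO=none
ONBOOT=yes
IPADDR=172.16.42.2
NETMASK=255.255.255.0
BONDING_OPTS="mode=5 miimon=100"

# /etc/sysconfig/network-scripts/ifcfg-bond0 (em3/em4, connected to the two switches)
DEVICE=bond0
BOOTPROTO=none
ONBOOT=yes
BONDING_OPTS="mode=1 miimon=100"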
Cluster configuration:
<?xml version="1.0"?>
<cluster config_version="51" name="my-cluster">
<cman expected_votes="1" two_node="1"/>
<clusternodes>
<clusternode name="df1-clusterlink" nodeid="1">
<fence>
<method name="VBoxManage-DF-1">
<device name="VBoxManage-DF-1" />
</method>
</fence>
<unfence>
</unfence>
</clusternode>
<clusternode name="df2-clusterlink" nodeid="2">
<fence>
<method name="VBoxManage-DF-2">
<device name="VBoxManage-DF-2" />
</method>
</fence>
<unfence>
</unfence>
</clusternode>
</clusternodes>
<fencedevices>
<fencedevice name="VBoxManage-DF-1" agent="fence_vbox" vboxhost="vboxhost.private"
login="test" vmname="RHEL 6.1 x86_64 DF-System Server 1" />
<fencedevice name="VBoxManage-DF-2" agent="fence_vbox" vboxhost="vboxhost.private"
login="test" vmname="RHEL 6.1 x86_64 DF-System Server 2" />
</fencedevices>
<rm>
<resources>
<ip address="10.200.104.15/27" monitor_link="on"
sleeptime="10"/>
<script file="/usr/share/cluster/app.sh" name="myapp"/>
</resources>
<failoverdomains>
<failoverdomain name="fod-myapp" nofailback="0" ordered="1"
restricted="0">
<failoverdomainnode name="df1-clusterlink" priority="1"/>
<failoverdomainnode name="df2-clusterlink" priority="2"/>
</failoverdomain>
</failoverdomains>
<service domain="fod-myapp" exclusive="1" max_restarts="3"
name="rg-myapp" recovery="restart" restart_expire_time="1">
<script ref="myapp"/>
<ip ref="10.200.104.15/27"/>
</service>
</rm>
<logging debug="on"/>
<gfs_controld enable_plock="0" plock_rate_limit="0"/>
<dlm enable_plock="0" plock_ownership="1" plock_rate_limit="0"/>
</cluster>
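Side note: a changed cluster.conf can be checked against the schema on the
node before it is propagated; on RHEL 6 that should be something like (tool
names quoted from memory):

ccs_config_validate        # check /etc/cluster/cluster.conf syntax/schema
cman_tool version -r       # bump/propagate the new config_version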
--------------------------------------------------------------------------------
Problem:
Sometimes the second node "detects" that the token has been lost
(corosync.log):
[no TOTEM messages before that]
Jul 28 13:00:10 corosync [TOTEM ] The token was lost in the OPERATIONAL
state.
Jul 28 13:00:10 corosync [TOTEM ] A processor failed, forming new
configuration.
Jul 28 13:00:10 corosync [TOTEM ] Receive multicast socket recv buffer
size (262142 bytes).
Jul 28 13:00:10 corosync [TOTEM ] Transmit multicast socket send buffer
size (262142 bytes).
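To correlate such an event across both nodes one can grep the cluster logs
around the timestamp (default RHEL 6 locations under /var/log/cluster/
assumed):

grep 'Jul 28 13:00' /var/log/cluster/corosync.log
grep -i fence /var/log/cluster/fenced.log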
This happens, let's say, once a week, and it leads to fencing of the first
node. From 'corosync-objctl -a' it looks like this may be due to a
consensus timeout (an excerpt of the command's output follows); I marked
the lines that I consider important so far:
totem.transport=udp
totem.version=2
totem.nodeid=2
totem.vsftype=none
totem.token=10000
totem.join=60
totem.fail_recv_const=2500
totem.consensus=2000
totem.rrp_mode=none
totem.secauth=1
totem.key=my-cluster
totem.interface.ringnumber=0
totem.interface.bindnetaddr=172.16.42.2
totem.interface.mcastaddr=239.192.187.168
totem.interface.mcastport=5405
runtime.totem.pg.mrp.srp.orf_token_tx=3
runtime.totem.pg.mrp.srp.orf_token_rx=1103226
runtime.totem.pg.mrp.srp.memb_merge_detect_tx=395
runtime.totem.pg.mrp.srp.memb_merge_detect_rx=1098359
runtime.totem.pg.mrp.srp.memb_join_tx=38
runtime.totem.pg.mrp.srp.memb_join_rx=50
runtime.totem.pg.mrp.srp.mcast_tx=218
runtime.totem.pg.mrp.srp.mcast_retx=0
runtime.totem.pg.mrp.srp.mcast_rx=541
runtime.totem.pg.mrp.srp.memb_commit_token_tx=12
runtime.totem.pg.mrp.srp.memb_commit_token_rx=18
runtime.totem.pg.mrp.srp.token_hold_cancel_tx=49
runtime.totem.pg.mrp.srp.token_hold_cancel_rx=173
runtime.totem.pg.mrp.srp.operational_entered=6
runtime.totem.pg.mrp.srp.operational_token_lost=1
^^^
runtime.totem.pg.mrp.srp.gather_entered=7
runtime.totem.pg.mrp.srp.gather_token_lost=0
runtime.totem.pg.mrp.srp.commit_entered=6
runtime.totem.pg.mrp.srp.commit_token_lost=0
runtime.totem.pg.mrp.srp.recovery_entered=6
runtime.totem.pg.mrp.srp.recovery_token_lost=0
runtime.totem.pg.mrp.srp.consensus_timeouts=1
^^^
runtime.totem.pg.mrp.srp.mtt_rx_token=1727
runtime.totem.pg.mrp.srp.avg_token_workload=62244458
runtime.totem.pg.mrp.srp.avg_backlog_calc=0
runtime.totem.pg.mrp.srp.rx_msg_dropped=0
runtime.totem.pg.mrp.srp.members.2.ip=r(0) ip(172.16.42.2)
runtime.totem.pg.mrp.srp.members.2.join_count=1
runtime.totem.pg.mrp.srp.members.2.status=joined
runtime.totem.pg.mrp.srp.members.1.ip=r(0) ip(172.16.42.1)
runtime.totem.pg.mrp.srp.members.1.join_count=3
runtime.totem.pg.mrp.srp.members.1.status=joined
runtime.blackbox.dump_flight_data=no
runtime.blackbox.dump_state=no
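For reference, the relevant keys above can be pulled out directly instead of
reading the full dump, e.g.:

corosync-objctl -a | egrep 'totem\.(token|consensus)=|token_lost|consensus_timeouts'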
Some questions at this point:
A) Why did the cluster lose the token? Due to a timeout? If so, which one:
token (10000) or consensus (2000)?
B) Why did that timeout elapse at all? Maybe that is connected with the
answer to A?
C) Is it normal that 'token=10000' and 'consensus=2000', although the
documentation says the defaults are 'token=1000' and 'consensus=1.2*token'?
D) Since I suspect problems with the switches connecting the other
interfaces (em3/em4, bonded to bond0), I wonder whether any cluster traffic
goes that way instead of via bond1 (see the check sketched below).
As I already stated: the connection of em1/em2 is a direct one without any
network element.
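The check I have in mind for D (commands from memory; the bond names are an
assumption, the port comes from the corosync-objctl output above):

corosync-cfgtool -s                 # shows the address/ring corosync is bound to
tcpdump -ni bond1 udp port 5405     # totem traffic should show up here
tcpdump -ni bond0 udp port 5405     # and nothing should show up here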
For now I plan to add the following line to cluster.conf and see whether
the situation improves:
<totem token_retransmits_before_loss_const="10"
fail_recv_const="100" consensus="12000"/>
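The consensus value simply follows the '1.2 * token' rule from the
documentation: 1.2 * 10000 = 12000 ms. After restarting cman on both nodes
I would verify that the values actually took effect with something like:

corosync-objctl -a | grep '^totem\.'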
Any comments on that?
While googling for possible causes I have also seen that it can be a
problem if the two nodes' clocks are not synchronized; in my case, however,
ntpd on both nodes uses two stratum 2 NTP servers. I also cannot find
anything unusual in the log files, such as a jump of multiple seconds,
although I have to admit that ntpd does not yet run with debugging enabled.
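What I would check in that regard (standard ntp tooling, default syslog
location assumed):

ntpq -p                                  # offset/jitter against the two stratum 2 servers, on both nodes
grep -i 'time reset' /var/log/messages   # clock steps logged by ntpd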
Thanks in advance for any hint or comment!
Kind regards,
Heiko