Hi.
You have some errors in your cluster.conf file.
1. You must check that fence_rsa works before starting the cluster (see the test command below).
2. If you are using quorumd, change cman to: <cman expected_votes="3" two_node="0"/>
3. Set quorumd to votes="1" and min_score="1".
4. Change your heuristic program to something like a ping to your router (it is better to add more heuristics); see the config sketch below.
5. Install the most recent rpms of cman, openais and rgmanager.
6. clustat should show that the qdisk is online; cman should start qdiskd.
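For item 1, a quick way to test the RSA fencing before starting the cluster is to call the agent by hand from each node against the other node's RSA card. A rough sketch (the USER/PASSWORD placeholders come from your config; check fence_rsa -h for the exact options your version supports):

    # from node2, check that we can reach and log in to node1's RSA card
    fence_rsa -a RSA_node1 -l USER -p PASSWORD -o status

If this cannot log in, fenced will keep retrying the fence and rgmanager will not recover the service on the surviving node.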
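For items 2-4, a rough sketch of how the cman/quorumd part could look with a single-vote quorum disk and a ping heuristic (the device, label, interval and tko values are taken from your config; the router address 172.28.104.1 is only a placeholder, put your real gateway there):

    <cman expected_votes="3" two_node="0"/>
    <quorumd device="/dev/vg1quorom/lv1quorom" interval="1" label="quorum" min_score="1" tko="10" votes="1">
        <!-- heuristic: the node keeps its qdisk vote only while it can ping the router -->
        <heuristic interval="2" program="ping -c1 -w1 172.28.104.1" score="1"/>
    </quorumd>

With two nodes at one vote each plus one vote from the quorum disk, expected_votes is 3, so a single node plus the qdisk keeps quorum.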
For more information you can read:
http://sources.redhat.com/cluster/wiki/FAQ/CMAN (it helped me.)
Regards
Shalom.
On Mon, Mar 8, 2010 at 12:44 PM, <mogruith@xxxxxxx> wrote:
Hi all
Here is my cluster.conf:
<?xml version="1.0"?>
<cluster config_version="6" name="TEST">
    <quorumd device="/dev/vg1quorom/lv1quorom" interval="1" label="quorum" min_score="3" tko="10" votes="3">
        <heuristic interval="2" program="/usr/sbin/qdiskd" score="1"/>
    </quorumd>
    <fence_daemon post_fail_delay="0" post_join_delay="3"/>
    <cman expected_votes="1" two_node="1"/>
    <clusternodes>
        <clusternode name="node1" nodeid="1" votes="1">
            <fence>
                <method name="1">
                    <device name="RSA_node1"/>
                </method>
            </fence>
        </clusternode>
        <clusternode name="node2" nodeid="2" votes="1">
            <fence>
                <method name="1">
                    <device name="RSA_node2"/>
                </method>
            </fence>
        </clusternode>
    </clusternodes>
    <cman/>
    <fencedevices>
        <fencedevice agent="fence_rsa" ipaddr="RSA_node1" login="USER" name="RSA_node1" passwd="PASSWORD"/>
        <fencedevice agent="fence_rsa" ipaddr="RSA_node2" login="USER" name="RSA_node2" passwd="PASSWORD"/>
    </fencedevices>
    <rm>
        <failoverdomains>
            <failoverdomain name="TEST" ordered="1" restricted="1">
                <failoverdomainnode name="node1" priority="1"/>
                <failoverdomainnode name="node2" priority="2"/>
            </failoverdomain>
        </failoverdomains>
        <resources>
            <ip address="172.28.104.80" monitor_link="1"/>
            <clusterfs device="/dev/vg1data/lv1data" force_unmount="0" fsid="30516" fstype="gfs2" mountpoint="/data" name="DATA" options=""/>
        </resources>
        <service autostart="1" domain="TEST" exclusive="1" name="TEST">
            <ip ref="172.28.104.80">
                <clusterfs ref="DATA"/>
            </ip>
        </service>
    </rm>
</cluster>
N.B. node1, node2, RSA_node1 and RSA_node2 are set in /etc/hosts.
When I move the service from node1 to node2 (by forcing a reboot of node1), it fails (probably because of a network problem), but is there a timeout? If node2 can't connect to node1's RSA, why doesn't it consider node1 "dead", and why doesn't the service move to node2?
Here is the clustat:
[root@node2 ~]# clustat
Cluster Status for TEST @ Mon Mar 8 11:33:32 2010
Member Status: Quorate

 Member Name                        ID   Status
 ------ ----                        ---- ------
 node1                                 1 Offline
 node2                                 2 Online, Local, rgmanager

 Service Name               Owner (Last)               State
 ------- ----               ----- ------               -----
 service:TEST               node1                      stopping
It has been stuck in "stopping" like that for 30 minutes!
Here is the log:
Mar 8 11:35:45 node2 fenced[7038]: agent "fence_rsa" reports: Unable to connect/login to fencing device
Mar 8 11:35:45 node2 fenced[7038]: fence "node1" failed
Mar 8 11:35:50 node2 fenced[7038]: fencing node "node1"
Mar 8 11:35:56 node2 fenced[7038]: agent "fence_rsa" reports: Unable to connect/login to fencing device
Mar 8 11:35:56 node2 fenced[7038]: fence "node1" failed
Why is node2 still trying to fence node1?
Here is something else:
[root@node2 ~]# cman_tool services
type             level name       id       state
fence            0     default    00010001 FAIL_START_WAIT
[2]
dlm              1     rgmanager  00020001 FAIL_ALL_STOPPED
[1 2]
How can I verify that quorum is being used?
Last question: I have 3 networks (6 NICs, 3 bonds), one of which is dedicated to heartbeat. Where do I have to set this in cluster.conf? I would like node1 and node2 to communicate over their own bond3.
Thanks for your help.
mog
--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster