David Teigland wrote:
Here's how it is now. Above the names were "hosting-cl02-01" and "hosting-cl02-02". Could you clear that up and, if there are still problems, send your cluster.conf file? Thanks
Using new hostnames and cluster.conf (blade center's IP address and community string removed):
==================================
<?xml version="1.0"?>
<cluster name="cluster" config_version="3">
<cman two_node="1" expected_votes="1"> </cman>
<clusternodes>
  <clusternode name="cluster-node2" votes="1">
    <fence>
      <method name="single">
        <device name="ibmblade" port="7"/>
      </method>
    </fence>
  </clusternode>
  <clusternode name="cluster-node1" votes="1">
    <fence>
      <method name="single">
        <device name="ibmblade" port="6"/>
      </method>
    </fence>
  </clusternode>
</clusternodes>
<fencedevices>
<fencedevice name="ibmblade" agent="fence_ibmblade" ipaddr="IP_ADDRESS_HERE" community="COMMUNITY_HERE"/>
</fencedevices>
</cluster>
===========================================
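As a sanity check on the configuration above, here is a minimal Python sketch that lists which fence device and port each node refers to, and confirms that every referenced device is declared under <fencedevices>. The function name fence_map is made up for illustration, and the embedded XML is just the file above with the same placeholders; it is not part of any cluster tool.

```python
import xml.etree.ElementTree as ET

# Copy of the cluster.conf shown above (IP address and community are placeholders).
CLUSTER_CONF = """<?xml version="1.0"?>
<cluster name="cluster" config_version="3">
<cman two_node="1" expected_votes="1"> </cman>
<clusternodes>
  <clusternode name="cluster-node2" votes="1">
    <fence> <method name="single"> <device name="ibmblade" port="7"/> </method> </fence>
  </clusternode>
  <clusternode name="cluster-node1" votes="1">
    <fence> <method name="single"> <device name="ibmblade" port="6"/> </method> </fence>
  </clusternode>
</clusternodes>
<fencedevices>
  <fencedevice name="ibmblade" agent="fence_ibmblade" ipaddr="IP_ADDRESS_HERE" community="COMMUNITY_HERE"/>
</fencedevices>
</cluster>"""


def fence_map(conf_text):
    """Return {node name: [(device name, port), ...]} (hypothetical helper)."""
    root = ET.fromstring(conf_text)
    # Names of all devices declared in <fencedevices>.
    declared = {dev.get("name") for dev in root.iter("fencedevice")}
    result = {}
    for node in root.iter("clusternode"):
        devices = []
        for dev in node.iter("device"):
            if dev.get("name") not in declared:
                raise ValueError(
                    "node %s uses undeclared fence device %s"
                    % (node.get("name"), dev.get("name"))
                )
            devices.append((dev.get("name"), dev.get("port")))
        result[node.get("name")] = devices
    return result


if __name__ == "__main__":
    for name, devices in fence_map(CLUSTER_CONF).items():
        print(name, devices)
```

Both nodes point at the same "ibmblade" device with different ports, so the configuration itself looks consistent as far as this check goes.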
Commands and their output (console or syslog):
# modprobe gfs
# modprobe lock_dlm
Feb 15 15:10:04 cluster-node1 Lock_Harness <CVS> (built Feb 15 2005 12:00:38) installed
Feb 15 15:10:04 cluster-node1 GFS <CVS> (built Feb 15 2005 12:00:52) installed
Feb 15 15:10:08 cluster-node1 CMAN <CVS> (built Feb 15 2005 12:00:31) installed
Feb 15 15:10:08 cluster-node1 NET: Registered protocol family 30
Feb 15 15:10:08 cluster-node1 DLM <CVS> (built Feb 15 2005 12:00:34) installed
Feb 15 15:10:08 cluster-node1 Lock_DLM (built Feb 15 2005 12:00:39) installed
dm-mod is built into the kernel (not loaded as a module).
# ccsd -V
ccsd DEVEL.1108443619 (built Feb 15 2005 12:01:01)
Copyright (C) Red Hat, Inc. 2004 All rights reserved.
# ccsd -4
Feb 15 15:10:58 cluster-node1 ccsd[8556]: Starting ccsd DEVEL.1108443619:
Feb 15 15:10:58 cluster-node1 ccsd[8556]: Built: Feb 15 2005 12:01:01
Feb 15 15:10:58 cluster-node1 ccsd[8556]: Copyright (C) Red Hat, Inc. 2004 All rights reserved.
Feb 15 15:10:58 cluster-node1 ccsd[8556]: IP Protocol:: IPv4 only
# cman_tool join
Feb 15 15:12:27 cluster-node1 ccsd[8556]: cluster.conf (cluster name = cluster, version = 3) found.
Feb 15 15:12:28 cluster-node1 CMAN: Waiting to join or form a Linux-cluster
Feb 15 15:12:28 cluster-node1 ccsd[8558]: Connected to cluster infrastruture via: CMAN/SM Plugin v1.1
Feb 15 15:12:28 cluster-node1 ccsd[8558]: Initial status:: Inquorate
Feb 15 15:13:00 cluster-node1 CMAN: forming a new cluster
Feb 15 15:13:00 cluster-node1 CMAN: quorum regained, resuming activity
Feb 15 15:13:00 cluster-node1 ccsd[8558]: Cluster is quorate. Allowing connections.
# cman_tool status
Protocol version: 5.0.1
Config version: 3
Cluster name: cluster
Cluster ID: 13364
Membership state: Cluster-Member
Nodes: 1
Expected_votes: 1
Total_votes: 1
Quorum: 1
Active subsystems: 0
Node name: cluster-node1
Node addresses: 192.168.192.146
# cman_tool nodes
Node  Votes Exp Sts  Name
   1    1    1   M   cluster-node1
# fence_tool join
Feb 15 15:14:26 cluster-node1 fenced[8847]: cluster-node2 not a cluster member after 6 sec post_join_delay
Feb 15 15:14:26 cluster-node1 fenced[8847]: fencing node "cluster-node2"
Feb 15 15:14:32 cluster-node1 fenced[8847]: fence "cluster-node2" success
At this point "cluster-node2" was fenced and automatically rebooted, which is good.
Now I join cluster-node2 to the cluster:
# modprobe gfs
# modprobe lock_dlm
# cman_tool join
# fence_tool join
Feb 15 15:18:30 cluster-node2 ccsd[8376]: Starting ccsd DEVEL.1108443619:
Feb 15 15:18:30 cluster-node2 ccsd[8376]: Built: Feb 15 2005 12:01:01
Feb 15 15:18:30 cluster-node2 ccsd[8376]: Copyright (C) Red Hat, Inc. 2004 All rights reserved.
Feb 15 15:18:30 cluster-node2 ccsd[8376]: IP Protocol:: IPv4 only
Feb 15 15:18:34 cluster-node2 ccsd[8376]: cluster.conf (cluster name = cluster, version = 3) found.
Feb 15 15:18:34 cluster-node2 ccsd[8376]: Remote copy of cluster.conf is from quorate node.
Feb 15 15:18:34 cluster-node2 ccsd[8376]: Local version # : 3
Feb 15 15:18:34 cluster-node2 ccsd[8376]: Remote version #: 3
Feb 15 15:18:41 cluster-node2 Lock_Harness <CVS> (built Feb 15 2005 12:00:38) installed
Feb 15 15:18:41 cluster-node2 GFS <CVS> (built Feb 15 2005 12:00:52) installed
Feb 15 15:18:44 cluster-node2 CMAN <CVS> (built Feb 15 2005 12:00:31) installed
Feb 15 15:18:44 cluster-node2 NET: Registered protocol family 30
Feb 15 15:18:44 cluster-node2 DLM <CVS> (built Feb 15 2005 12:00:34) installed
Feb 15 15:18:44 cluster-node2 Lock_DLM (built Feb 15 2005 12:00:39) installed
Feb 15 15:18:47 cluster-node2 ccsd[8376]: Remote copy of cluster.conf is from quorate node.
Feb 15 15:18:47 cluster-node2 ccsd[8376]: Local version # : 3
Feb 15 15:18:47 cluster-node2 ccsd[8376]: Remote version #: 3
Feb 15 15:18:47 cluster-node2 CMAN: Waiting to join or form a Linux-cluster
Feb 15 15:18:48 cluster-node2 ccsd[8378]: Connected to cluster infrastruture via: CMAN/SM Plugin v1.1
Feb 15 15:18:48 cluster-node2 ccsd[8378]: Initial status:: Inquorate
Feb 15 15:18:50 cluster-node2 CMAN: sending membership request
Feb 15 15:18:50 cluster-node2 CMAN: got node cluster-node1
Feb 15 15:18:50 cluster-node2 CMAN: quorum regained, resuming activity
Feb 15 15:18:50 cluster-node2 ccsd[8378]: Cluster is quorate. Allowing connections.
On node 1:
# clvmd
Feb 15 15:24:56 cluster-node1 CMAN: WARNING no listener for port 11 on node cluster-node2
On node 2:
# clvmd
Feb 15 15:25:03 cluster-node2 clvmd: Cluster LVM daemon started - connected to CMAN
On node 1:
# cman_tool nodes
Node  Votes Exp Sts  Name
   1    1    1   M   cluster-node1
   2    1    1   M   cluster-node2
# cman_tool services
Service          Name                              GID LID State     Code
Fence Domain:    "default"                           1   2 run       -
[1 2]
DLM Lock Space:  "clvmd"                             3   3 run       -
[1 2]
# cman_tool status
Protocol version: 5.0.1
Config version: 3
Cluster name: cluster
Cluster ID: 13364
Membership state: Cluster-Member
Nodes: 2
Expected_votes: 1
Total_votes: 2
Quorum: 1
Active subsystems: 3
Node name: cluster-node1
Node addresses: 192.168.192.146
Now I shut down node 2's network interface.
On node 2:
# ifconfig eth0 down
On node 1:
Feb 15 15:29:50 cluster-node1 CMAN: removing node cluster-node2 from the cluster : Missed too many heartbeats
# cman_tool status
Protocol version: 5.0.1
Config version: 3
Cluster name: cluster
Cluster ID: 13364
Membership state: Cluster-Member
Nodes: 2
Expected_votes: 1
Total_votes: 2
Quorum: 1
Active subsystems: 3
Node name: cluster-node1
Node addresses: 192.168.192.146
# cman_tool status
Protocol version: 5.0.1
Config version: 3
Cluster name: cluster
Cluster ID: 13364
Membership state: Cluster-Member
Nodes: 1
Expected_votes: 1
Total_votes: 1
Quorum: 1
Active subsystems: 3
Node name: cluster-node1
Node addresses: 192.168.192.146
# cman_tool nodes
Node  Votes Exp Sts  Name
   1    1    1   M   cluster-node1
   2    1    1   X   cluster-node2
# cman_tool services
Service          Name                              GID LID State     Code
Fence Domain:    "default"                           1   2 run       -
[1 2]
DLM Lock Space:  "clvmd"                             3   3 run       -
[1 2]
Nothing about fencing shows up in the logs, and node 2 is not automatically rebooted. Shouldn't node 2 get fenced here?
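To make the mismatch easier to spot, here is a rough Python sketch that compares the two outputs above: it collects the node IDs that "cman_tool nodes" shows with status X and reports every service in "cman_tool services" that still lists them in its member set. The helper names (dead_nodes, stale_members) are made up, and the parsing assumes the column layout shown in this mail; other versions of the tools may print something different.

```python
import re

# Output captured on node 1 after node 2's interface went down (from above).
NODES_OUTPUT = """\
Node  Votes Exp Sts  Name
   1    1    1   M   cluster-node1
   2    1    1   X   cluster-node2
"""

SERVICES_OUTPUT = """\
Service          Name                              GID LID State     Code
Fence Domain:    "default"                           1   2 run       -
[1 2]
DLM Lock Space:  "clvmd"                             3   3 run       -
[1 2]
"""


def dead_nodes(nodes_text):
    """Return the set of node IDs whose Sts column is X (hypothetical helper)."""
    dead = set()
    for line in nodes_text.splitlines():
        parts = line.split()
        # Data rows start with a numeric node ID; the header row does not.
        if len(parts) >= 5 and parts[0].isdigit() and parts[3] == "X":
            dead.add(int(parts[0]))
    return dead


def stale_members(nodes_text, services_text):
    """Return {(service type, name): [node IDs marked X but still listed]}."""
    dead = dead_nodes(nodes_text)
    # Service type, quoted name, then the bracketed member list that follows.
    pattern = re.compile(r'([A-Za-z ]+):\s+"([^"]+)"[^\[]*\[([^\]]*)\]')
    stale = {}
    for kind, name, members in pattern.findall(services_text):
        ids = {int(tok) for tok in members.split()}
        leftover = sorted(ids & dead)
        if leftover:
            stale[(kind.strip(), name)] = leftover
    return stale


if __name__ == "__main__":
    for (kind, name), ids in stale_members(NODES_OUTPUT, SERVICES_OUTPUT).items():
        print("%s %s still lists removed node(s): %s" % (kind, name, ids))
```

With the outputs above, both the fence domain "default" and the "clvmd" lock space still list node 2 even though CMAN has removed it from membership, which is exactly the state I am asking about.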
Regards,
Fajar