Hi Digimer,

Here is what the logs show:

[root@ustlvcmsp1954 ~]# tail -f /var/log/messages
Jan 7 16:14:01 ustlvcmsp1954 corosync[8182]: [SERV ] Service engine loaded: corosync profile loading service
Jan 7 16:14:01 ustlvcmsp1954 corosync[8182]: [QUORUM] Using quorum provider quorum_cman
Jan 7 16:14:01 ustlvcmsp1954 corosync[8182]: [SERV ] Service engine loaded: corosync cluster quorum service v0.1
Jan 7 16:14:01 ustlvcmsp1954 corosync[8182]: [MAIN ] Compatibility mode set to whitetank. Using V1 and V2 of the synchronization engine.
Jan 7 16:14:01 ustlvcmsp1954 corosync[8182]: [TOTEM ] A processor joined or left the membership and a new membership was formed.
Jan 7 16:14:01 ustlvcmsp1954 corosync[8182]: [QUORUM] Members[1]: 1
Jan 7 16:14:01 ustlvcmsp1954 corosync[8182]: [QUORUM] Members[1]: 1
Jan 7 16:14:01 ustlvcmsp1954 corosync[8182]: [CPG   ] chosen downlist: sender r(0) ip(10.30.197.108) ; members(old:0 left:0)
Jan 7 16:14:01 ustlvcmsp1954 corosync[8182]: [MAIN ] Completed service synchronization, ready to provide service.
Jan 7 16:14:01 ustlvcmsp1954 rgmanager[8099]: Waiting for quorum to form
Jan 7 16:15:06 ustlvcmsp1954 corosync[8182]: [SERV ] Unloading all Corosync service engines.
Jan 7 16:15:06 ustlvcmsp1954 corosync[8182]: [SERV ] Service engine unloaded: corosync extended virtual synchrony service
Jan 7 16:15:06 ustlvcmsp1954 corosync[8182]: [SERV ] Service engine unloaded: corosync configuration service
Jan 7 16:15:06 ustlvcmsp1954 corosync[8182]: [SERV ] Service engine unloaded: corosync cluster closed process group service v1.01
Jan 7 16:15:06 ustlvcmsp1954 corosync[8182]: [SERV ] Service engine unloaded: corosync cluster config database access v1.01
Jan 7 16:15:06 ustlvcmsp1954 corosync[8182]: [SERV ] Service engine unloaded: corosync profile loading service
Jan 7 16:15:06 ustlvcmsp1954 corosync[8182]: [SERV ] Service engine unloaded: openais checkpoint service B.01.01
Jan 7 16:15:06 ustlvcmsp1954 corosync[8182]: [SERV ] Service engine unloaded: corosync CMAN membership service 2.90
Jan 7 16:15:06 ustlvcmsp1954 corosync[8182]: [SERV ] Service engine unloaded: corosync cluster quorum service v0.1
Jan 7 16:15:06 ustlvcmsp1954 corosync[8182]: [MAIN ] Corosync Cluster Engine exiting with status 0 at main.c:2055.
Jan 7 16:15:06 ustlvcmsp1954 rgmanager[8099]: Quorum formed

Then cman startup dies at:

Starting cman...                                           [ OK ]
Waiting for quorum... Timed-out waiting for cluster        [FAILED]

Yes, I did make the change:

<fence_daemon post_join_delay="30"/>

but the problem is still there.

One thing I don't understand is why the cluster is waiting for quorum at all. I did not set up any quorum disk in the cluster.conf file.

Any help would be appreciated.

Vinh
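P.S. For reference, a minimal sketch of the top of cluster.conf with that change in place. cman derives quorum from node votes, not from a quorum disk (a disk is only involved if a <quorumd> section is configured), so with five one-vote nodes the cluster is quorate once three members see each other. The log above shows Members[1]: 1, i.e. this node only sees itself, which would explain why it times out waiting for quorum. The config_version value here is just an example; it has to be bumped on every edit.

<?xml version="1.0"?>
<cluster config_version="16" name="p1954_to_p1958">
        <!-- Give freshly joined nodes extra time before fenced acts;
             this is the post_join_delay change discussed in this thread. -->
        <fence_daemon post_join_delay="30"/>
        <!-- No <quorumd> block: quorum comes from node votes alone.
             Five nodes x 1 vote each => quorate at 3 members. -->
        <clusternodes>
                <!-- clusternode entries as in the config quoted below -->
        </clusternodes>
        <fencedevices>
                <!-- fencedevice entries as in the config quoted below -->
        </fencedevices>
</cluster>

On a node that did join, 'cman_tool status' should show the Expected votes and Quorum numbers, which makes it easy to confirm that quorum here is vote-based rather than disk-based.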
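Also, Digimer's point further down in the quoted thread is that the fence devices are defined but the <clusternode> entries never reference them. Here is a hedged sketch of what one per-node <fence> block could look like; the port value is a placeholder, since for fence_vmware_soap it would need to be the VM's name (or uuid) exactly as vCenter/ESXi reports it, and ipaddr on the fencedevice normally points at the vCenter/ESXi host that can actually power-cycle the guest.

<clusternode name="ustlvcmsp1954" nodeid="1">
        <fence>
                <method name="vmware">
                        <!-- "port" is an assumed placeholder; replace it with the
                             VM name (or use uuid="...") as known to vCenter. -->
                        <device name="p1954" port="ustlvcmsp1954" ssl="on"/>
                </method>
        </fence>
</clusternode>

The other four clusternode entries would get the same block pointing at their own fencedevice (p1955 through p1958). Once that is in place, running 'fence_node ustlvcmsp1954' from another member is a quick way to confirm fencing actually works before putting GFS2 on top of it.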
-----Original Message-----
From: linux-cluster-bounces@xxxxxxxxxx [mailto:linux-cluster-bounces@xxxxxxxxxx] On Behalf Of Digimer
Sent: Wednesday, January 07, 2015 3:59 PM
To: linux clustering
Subject: Re: needs helps GFS2 on 5 nodes cluster

On 07/01/15 03:39 PM, Cao, Vinh wrote:
> Hello Digimer,
>
> Yes, I would agree with you that RHEL 6.4 is old. We patch monthly, but I'm not sure why these servers are still at 6.4. Most of our systems are on 6.6.
>
> Here is my cluster config. All I want is to use the cluster so GFS2 can be mounted via /etc/fstab.
>
> [root@ustlvcmsp1955 ~]# cat /etc/cluster/cluster.conf
> <?xml version="1.0"?>
> <cluster config_version="15" name="p1954_to_p1958">
>     <clusternodes>
>         <clusternode name="ustlvcmsp1954" nodeid="1"/>
>         <clusternode name="ustlvcmsp1955" nodeid="2"/>
>         <clusternode name="ustlvcmsp1956" nodeid="3"/>
>         <clusternode name="ustlvcmsp1957" nodeid="4"/>
>         <clusternode name="ustlvcmsp1958" nodeid="5"/>
>     </clusternodes>

You don't configure the fencing for the nodes... If anything causes a fence, the cluster will lock up (by design).

>     <fencedevices>
>         <fencedevice agent="fence_vmware_soap" ipaddr="10.30.197.108" login="rhfence" name="p1954" passwd="xxxxxxxx"/>
>         <fencedevice agent="fence_vmware_soap" ipaddr="10.30.197.109" login="rhfence" name="p1955" passwd="xxxxxxxx"/>
>         <fencedevice agent="fence_vmware_soap" ipaddr="10.30.197.110" login="rhfence" name="p1956" passwd="xxxxxxxx"/>
>         <fencedevice agent="fence_vmware_soap" ipaddr="10.30.197.111" login="rhfence" name="p1957" passwd="xxxxxxxx"/>
>         <fencedevice agent="fence_vmware_soap" ipaddr="10.30.197.112" login="rhfence" name="p1958" passwd="xxxxxxxx"/>
>     </fencedevices>
> </cluster>
>
> clustat shows:
>
> Cluster Status for p1954_to_p1958 @ Wed Jan 7 15:38:00 2015
> Member Status: Quorate
>
>  Member Name          ID   Status
>  ------ ----          ---- ------
>  ustlvcmsp1954        1    Offline
>  ustlvcmsp1955        2    Online, Local
>  ustlvcmsp1956        3    Online
>  ustlvcmsp1957        4    Offline
>  ustlvcmsp1958        5    Online
>
> I need to make them all online, so I can use fencing for mounting the shared disk.
>
> Thanks,
> Vinh

What about the log entries from the start-up? Did you try the post_join_delay config?

> -----Original Message-----
> From: linux-cluster-bounces@xxxxxxxxxx [mailto:linux-cluster-bounces@xxxxxxxxxx] On Behalf Of Digimer
> Sent: Wednesday, January 07, 2015 3:16 PM
> To: linux clustering
> Subject: Re: needs helps GFS2 on 5 nodes cluster
>
> My first thought would be to set <fence_daemon post_join_delay="30" /> in cluster.conf.
>
> If that doesn't work, please share your configuration file. Then, with all nodes offline, open a terminal to each node and run 'tail -f -n 0 /var/log/messages'. With that running, start all the nodes and wait for things to settle down, then paste the five nodes' output as well.
>
> Also, 6.4 is pretty old, why not upgrade to 6.6?
>
> digimer
>
> On 07/01/15 03:10 PM, Cao, Vinh wrote:
>> Hello Cluster gurus,
>>
>> I'm trying to set up a Red Hat 6.4 OS cluster with 5 nodes. With two nodes I don't have any issue.
>>
>> But with 5 nodes, when I run clustat I get 3 nodes online and the other two offline.
>>
>> When I start one of the offline nodes with 'service cman start', I get:
>>
>> [root@ustlvcmspxxx ~]# service cman status
>> corosync is stopped
>>
>> [root@ustlvcmsp1954 ~]# service cman start
>> Starting cluster:
>>    Checking if cluster has been disabled at boot...   [ OK ]
>>    Checking Network Manager...                        [ OK ]
>>    Global setup...                                    [ OK ]
>>    Loading kernel modules...                          [ OK ]
>>    Mounting configfs...                               [ OK ]
>>    Starting cman...                                   [ OK ]
>>    Waiting for quorum... Timed-out waiting for cluster
>>                                                       [FAILED]
>> Stopping cluster:
>>    Leaving fence domain...                            [ OK ]
>>    Stopping gfs_controld...                           [ OK ]
>>    Stopping dlm_controld...                           [ OK ]
>>    Stopping fenced...                                 [ OK ]
>>    Stopping cman...                                   [ OK ]
>>    Waiting for corosync to shutdown:                  [ OK ]
>>    Unloading kernel modules...                        [ OK ]
>>    Unmounting configfs...                             [ OK ]
>>
>> Can you help?
>>
>> Thank you,
>> Vinh
>
> --
> Digimer
> Papers and Projects: https://alteeve.ca/w/
> What if the cure for cancer is trapped in the mind of a person without access to education?

--
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without access to education?

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster