Hello everyone,

I'm new to Linux clustering. I have built a two-node cluster (without a qdisk) consisting of:

- Red Hat 6.4
- cman
- pacemaker
- gfs2

The cluster can fail over (back and forth) between the two nodes with these 3 resources:

- ClusterIP
- WebFS (Filesystem resource, GFS2, mounting /dev/sdc on /mnt/gfs2_storage)
- WebSite (apache service)

My problem occurs when I stop/start the nodes in the following order (starting with both nodes up):

1. Stop node1 (shutdown) -> all resources fail over to node2 -> all resources keep working on node2
2. Stop node2 (stop the services: pacemaker, then cman) -> all resources stop (of course)
3. Start node1 (start the services: cman, then pacemaker) -> only ClusterIP starts; WebFS fails and WebSite does not start

(The exact commands I use for steps 2 and 3 are sketched at the end of this mail.)

Status:

Last updated: Mon Jun 16 18:34:56 2014
Last change: Mon Jun 16 14:24:54 2014 via cibadmin on server1
Stack: cman
Current DC: server1 - partition WITHOUT quorum
Version: 1.1.8-7.el6-394e906
2 Nodes configured, 1 expected votes
4 Resources configured.

Online: [ server1 ]
OFFLINE: [ server2 ]

ClusterIP   (ocf::heartbeat:IPaddr2):      Started server1
WebFS       (ocf::heartbeat:Filesystem):   Started server1 (unmanaged) FAILED

Failed actions:
    WebFS_stop_0 (node=server1, call=32, rc=1, status=Timed Out): unknown error

Here is my /etc/cluster/cluster.conf:

<?xml version="1.0"?>
<cluster config_version="1" name="mycluster">
  <logging debug="on"/>
  <clusternodes>
    <clusternode name="server1" nodeid="1">
      <fence>
        <method name="pcmk-redirect">
          <device name="pcmk" port="server1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="server2" nodeid="2">
      <fence>
        <method name="pcmk-redirect">
          <device name="pcmk" port="server2"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice name="pcmk" agent="fence_pcmk"/>
  </fencedevices>
</cluster>

And here is my "crm configure show":

node server1
node server2
primitive ClusterIP IPaddr2 \
        params ip=192.168.117.130 cidr_netmask=32 \
        op monitor interval=10s
primitive WebFS Filesystem \
        params device="/dev/sdc" directory="/mnt/gfs2_datastore" fstype=gfs2 \
        meta target-role=Started
primitive WebSite1 apache \
        params configfile="/mnt/nfs_datastore/httpd/conf/httpd.conf" statusurl="http://localhost/server-status" \
        op monitor interval=40s \
        meta target-role=Stopped
primitive WebSite2 apache \
        params configfile="/mnt/gfs2_datastore/httpd/conf/httpd.conf" statusurl="http://localhost/server-status" \
        op monitor interval=40s \
        meta target-role=Started
colocation webfs-with-ip inf: WebFS ClusterIP
colocation website-with-webfs inf: WebSite2 WebFS
order webfs-after-clusterip inf: ClusterIP WebFS
order website-after-webfs inf: WebFS WebSite2
property cib-bootstrap-options: \
        dc-version=1.1.8-7.el6-394e906 \
        cluster-infrastructure=cman \
        stonith-enabled=false \
        no-quorum-policy=ignore \
        expected-quorum-votes=1 \
        last-lrm-refresh=1402374391
rsc_defaults rsc-options: \
        resource-stickiness=100
rsc_defaults rsc_defaults-options: \
        resource-stickiness=100
op_defaults op_defaults-options: \
        migration-threshold=1

I don't have any clues for tracking this down; my guess is that the problem comes from the file system locking. Any advice would be much appreciated (I have also put the commands I was planning to try next at the end of this mail -- please tell me if they are the wrong direction).

Thank you.
Kien Le.
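
P.S. For steps 2 and 3 above I use the standard RHEL 6 init scripts; roughly (this is just a sketch of the sequence described above, nothing special):

    # Step 2: on node2, stop Pacemaker first, then cman
    service pacemaker stop
    service cman stop

    # Step 3: on node1, start cman first, then Pacemaker
    service cman start
    service pacemaker start

    # Check resource status afterwards
    crm_mon -1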
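
P.P.S. Would something along these lines be a reasonable way to clean up the failed stop and gather more detail? These are just the standard Pacemaker/cman tools as far as I know, and the log locations are my guess:

    # Clear the failed WebFS_stop_0 record so Pacemaker manages the resource again
    crm_resource --cleanup --resource WebFS
    crm_mon -1

    # Look for GFS2/DLM/fencing messages around the stop timeout
    grep -iE 'gfs2|dlm|fence' /var/log/messages
    less /var/log/cluster/corosync.log    # assuming the default cman/corosync log path

    # List DLM lockspaces to see whether the GFS2 lockspace is still hanging around
    dlm_tool ls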