Hi,

During some tests I got errors like "<emerg> #1: Quorum Dissolved".

My cluster has 6 nodes, which are virtual machines running on 2 physical nodes. On each physical node there are 3 vm services:

  Member Name                        ID   Status
  ------ ----                        ---- ------
  w2.local                              1 Online, Local, rgmanager
  w1.local                              2 Online, rgmanager

  Service Name                 Owner (Last)    State
  ------- ----                 ----- ------    -----
  vm:VM_Work11_RHEL51          w1.local        started
  vm:VM_Work12_RHEL51          w1.local        started
  vm:VM_Work13_RHEL51          w1.local        started
  vm:VM_Work21_RHEL51          w2.local        started
  vm:VM_Work22_RHEL51          w2.local        started
  vm:VM_Work23_RHEL51          w2.local        started

On the 6-node cluster I am running 2 httpd services (in restricted failover domains):

  Member Status: Quorate

  Member Name                        ID   Status
  ------ ----                        ---- ------
  w11.local                             1 Online, rgmanager
  w12.local                             2 Online, rgmanager
  w13.local                             3 Online, rgmanager
  w21.local                             4 Online, Local, rgmanager
  w22.local                             5 Online, rgmanager
  w23.local                             6 Online, rgmanager
  /dev/xvdd1                            0 Online, Quorum Disk

  Service Name                 Owner (Last)    State
  ------- ----                 ----- ------    -----
  service:httpd_w11            w11.local       started
  service:httpd_w21            w21.local       started

After shutting down the w11.local node the cluster should keep running normally, because there is still quorum: the quorum device has 5 votes, so (assuming the default one vote per node) the five surviving nodes plus the quorum disk hold 5 + 5 = 10 of the 11 expected votes, well above the quorum threshold of 6. The httpd_w11 service should be down, but the httpd_w21 service should stay up (the w21.local node is still running).

That is not what happens. On w21.local I get an error that quorum is dissolved and the cluster is not quorate. It takes some time before the cluster is quorate again; during that time rgmanager is not working and the httpd_w21 service is down. After quorum is regained I get this error:

  Mar 11 16:16:20 w21 clurgmgrd[1946]: <err> #34: Cannot get status for service service:httpd_w21

When all members of the cluster are online again, clustat shows:

1. on w21.local:

  Member Status: Quorate

  Member Name                        ID   Status
  ------ ----                        ---- ------
  w11.local                             1 Online, rgmanager
  w12.local                             2 Online, rgmanager
  w13.local                             3 Online, rgmanager
  w21.local                             4 Online, Local, rgmanager
  w22.local                             5 Online, rgmanager
  w23.local                             6 Online, rgmanager
  /dev/xvdd1                            0 Online, Quorum Disk

  Service Name                 Owner (Last)    State
  ------- ----                 ----- ------    -----
  service:httpd_w11            w11.local       started

2. on w11.local, w12.local and w13.local, the nodes that were fenced:

  Member Status: Quorate

  Member Name                        ID   Status
  ------ ----                        ---- ------
  w11.local                             1 Online, Local
  w12.local                             2 Online
  w13.local                             3 Online
  w21.local                             4 Online
  w22.local                             5 Online
  w23.local                             6 Online
  /dev/xvdd1                            0 Online, Quorum Disk

clustat shows that rgmanager is not running, but the logs say:

  Mar 11 14:56:27 w11 clurgmgrd[1942]: <notice> Resource Group Manager Starting
  Mar 11 14:56:37 w11 clurgmgrd[1942]: <err> #34: Cannot get status for service service:httpd_w11
  Mar 11 14:56:37 w11 clurgmgrd[1942]: <err> #34: Cannot get status for service service:httpd_w21

3. on w23.local:

  Member Status: Quorate

  Member Name                        ID   Status
  ------ ----                        ---- ------
  w11.local                             1 Online, rgmanager
  w12.local                             2 Online, rgmanager
  w13.local                             3 Online, rgmanager
  w21.local                             4 Online, rgmanager
  w22.local                             5 Online, rgmanager
  w23.local                             6 Online, Local, rgmanager
  /dev/xvdd1                            0 Online, Quorum Disk

  Service Name                 Owner (Last)    State
  ------- ----                 ----- ------    -----
  service:httpd_w11            w11.local       started
  service:httpd_w21            w21.local       started

So the state of the cluster looks different depending on which node you ask. The problems are:

1. After fencing nodes w11, w12 and w13, quorum is dissolved.
2. Services that should keep running on the surviving nodes go down.
3. After bringing the fenced nodes back up, rgmanager has a different view of the services on each node.
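For reference, the quorum and failover-domain parts of a cluster.conf for this layout would look roughly as below. This is only a sketch reconstructed from the description above, not the real file: the expected_votes value assumes the default one vote per node, and the domain name dom_w11, the ping heuristic and the abbreviated node list are illustrative. The httpd_w21 side is analogous.

  <cman expected_votes="11"/>
  <quorumd interval="1" tko="10" votes="5" device="/dev/xvdd1">
      <!-- illustrative heuristic: ping the default gateway -->
      <heuristic program="ping -c1 192.168.0.1" score="1" interval="2"/>
  </quorumd>
  <clusternodes>
      <clusternode name="w11.local" nodeid="1" votes="1"/>
      <!-- ... w12.local through w23.local, nodeids 2-6, one vote each ... -->
  </clusternodes>
  <rm>
      <failoverdomains>
          <!-- restricted domain: httpd_w11 may only run on w11.local -->
          <failoverdomain name="dom_w11" restricted="1">
              <failoverdomainnode name="w11.local" priority="1"/>
          </failoverdomain>
      </failoverdomains>
      <service name="httpd_w11" domain="dom_w11" autostart="1">
          <!-- apache resource goes here -->
      </service>
  </rm>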
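To compare the raw membership and vote accounting on each node, instead of relying on clustat's summary, the standard cman/qdisk tools on RHEL 5 can be run on every member, for example:

  # vote accounting: check "Expected votes", "Total votes" and "Quorum"
  cman_tool status

  # membership as cman itself sees it
  cman_tool nodes

  # confirm the quorum disk is visible from this node
  mkqdisk -L

If cman_tool reports the cluster quorate on a node where clustat claims rgmanager is down, the inconsistency would be in rgmanager's view rather than in the quorum accounting.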
I can't always reproduce this bug; sometimes everything goes OK, but that happens quite rarely.

Cheers,
Agnieszka Kukalowicz