I made the change and I will try it today during our scheduled maintenance.

Thanks,
Eric

From: linux-cluster-bounces@xxxxxxxxxx [mailto:linux-cluster-bounces@xxxxxxxxxx] On Behalf Of umesh susvirkar

Try setting the following in your cluster.conf file:

<cman expected_votes="3" quorum_dev_poll="35000">
    <multicast addr="224.0.0.1" interface="eth0"/>
</cman>

Calculation for quorum_dev_poll: it should be greater than (interval * tko). With the quorumd settings below, 5 * 6 = 30 seconds, so use 35 seconds (quorum_dev_poll is given in milliseconds, hence 35000).

<quorumd interval="5" label="delta_qdisk" min_score="1" tko="6" votes="1">
    <heuristic interval="5" program="ping -t1 -c1 192.168.1.1" score="1"/>
</quorumd>

For more info, read the following doc.

On Sat, Jul 24, 2010 at 3:50 AM, Eric Schneider <eschneid@xxxxxxxx> wrote:

I have a few 2-node clusters, and I have noticed that recently the clusters lose quorum when I reboot the node that is not running services. I could do this in the past without any problems. CentOS 5.5 on ESX 4.0 u1. Maybe a bug in a new kernel or in the cman software?

I get the following right away when the node reboots:

Jul 23 16:02:32 happy5 clurgmgrd[4269]: <notice> Member 2 shutting down
Jul 23 16:02:52 happy5 qdiskd[3562]: <info> Node 2 shutdown
Jul 23 16:03:02 happy5 qdiskd[3562]: <info> Assuming master role
Jul 23 16:03:03 happy5 clurgmgrd[4269]: <emerg> #1: Quorum Dissolved
Jul 23 16:03:03 happy5 openais[3533]: [CMAN ] lost contact with quorum device
Jul 23 16:03:03 happy5 openais[3533]: [CMAN ] quorum lost, blocking activity
Jul 23 16:03:03 happy5 ccsd[3493]: Cluster is not quorate. Refusing connection.
Jul 23 16:03:03 happy5 ccsd[3493]: Error while processing connect: Connection refused
Jul 23 16:03:03 happy5 ccsd[3493]: Cluster is not quorate. Refusing connection.
Jul 23 16:03:03 happy5 ccsd[3493]: Error while processing connect: Connection refused
Jul 23 16:03:03 happy5 ccsd[3493]: Invalid descriptor specified (-111).
Jul 23 16:03:03 happy5 ccsd[3493]: Someone may be attempting something evil.
Jul 23 16:03:03 happy5 ccsd[3493]: Error while processing get: Invalid request descriptor
Jul 23 16:03:03 happy5 ccsd[3493]: Invalid descriptor specified (-111).
Jul 23 16:03:03 happy5 ccsd[3493]: Someone may be attempting something evil.
Jul 23 16:03:03 happy5 ccsd[3493]: Error while processing get: Invalid request descriptor

<?xml version="1.0"?>
<cluster alias="delta_cluster" config_version="40" name="delta_cluster">
    <fence_daemon post_fail_delay="5" post_join_delay="120"/>
    <quorumd interval="5" label="delta_qdisk" min_score="1" tko="6" votes="1">
        <heuristic interval="5" program="ping -t1 -c1 192.168.1.1" score="1"/>
    </quorumd>
    <clusternodes>
        <clusternode name="node1" nodeid="1" votes="1">
            <fence>
                <method name="1">
                    <device name="node1"/>
                </method>
            </fence>
        </clusternode>
        <clusternode name="node2" nodeid="2" votes="1">
            <fence>
                <method name="1">
                    <device name="node2"/>
                </method>
            </fence>
        </clusternode>
    </clusternodes>
    <cman expected_votes="3">
        <multicast addr="224.0.0.1" interface="eth0"/>
    </cman>
    <fencedevices>
        <fencedevice agent="fence_manual" name="fence_manual"/>
        <fencedevice agent="fence_vmware" ipaddr="bob" login="username" name="node1" passwd="password" port="node1"/>
        <fencedevice agent="fence_vmware" ipaddr="bob" login="username" name="node2" passwd="password" port="node2"/>
    </fencedevices>
    <rm>
        <failoverdomains>
            <failoverdomain name="node1" ordered="0" restricted="1">
                <failoverdomainnode name="node1" priority="1"/>
            </failoverdomain>
            <failoverdomain name="node2" restricted="1">
                <failoverdomainnode name="node2" priority="1"/>
            </failoverdomain>
            <failoverdomain name="failover_pro-http" restricted="0">
                <failoverdomainnode name="node1" priority="1"/>
                <failoverdomainnode name="node2" priority="1"/>
            </failoverdomain>
        </failoverdomains>
    </rm>
    <totem token="21000"/>
</cluster>

Thanks,
Eric
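
The rule of thumb from the reply (quorum_dev_poll should be greater than interval * tko) can be checked against a cluster.conf with a few lines of Python. This is only a rough sketch, not a supported tool: the function name check_quorum_dev_poll and the trimmed SAMPLE config are made up for illustration, and it assumes interval is in seconds and quorum_dev_poll in milliseconds, as in the values suggested above.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down cluster.conf using the values suggested in the reply.
SAMPLE = """<cluster name="delta_cluster">
    <quorumd interval="5" label="delta_qdisk" min_score="1" tko="6" votes="1"/>
    <cman expected_votes="3" quorum_dev_poll="35000"/>
</cluster>"""


def check_quorum_dev_poll(conf_xml):
    """Return (qdisk_timeout_ms, quorum_dev_poll_ms, ok) for a cluster.conf string.

    qdisk_timeout_ms is interval * tko converted to milliseconds.
    quorum_dev_poll_ms is None when the attribute is not set on <cman>.
    ok is True only when quorum_dev_poll is set and exceeds the qdisk timeout.
    """
    root = ET.fromstring(conf_xml)
    quorumd = root.find("quorumd")
    cman = root.find("cman")
    if quorumd is None or cman is None:
        raise ValueError("cluster.conf needs both <quorumd> and <cman> elements")

    # Assumption: interval is in seconds, tko is a count of missed cycles.
    timeout_ms = int(quorumd.get("interval")) * int(quorumd.get("tko")) * 1000

    poll = cman.get("quorum_dev_poll")
    poll_ms = int(poll) if poll is not None else None

    ok = poll_ms is not None and poll_ms > timeout_ms
    return timeout_ms, poll_ms, ok


if __name__ == "__main__":
    timeout_ms, poll_ms, ok = check_quorum_dev_poll(SAMPLE)
    print("qdisk timeout (ms):", timeout_ms)
    print("quorum_dev_poll (ms):", poll_ms)
    print("quorum_dev_poll > interval * tko:", ok)
```

Run against the original config in this thread, where <cman> has no quorum_dev_poll attribute, the sketch reports the value as unset rather than guessing at a default.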
--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster