The pithy ruminations from "Fabio M. Di Nitto" <fdinitto@xxxxxxxxxx> on
"Re: quorum device not getting a vote causes 2-node cluster to be
inquorate" were:

=> On 03/15/2011 05:11 AM, bergman@xxxxxxxxxxxx wrote:
=> > I have been using a 2-node cluster with a quorum disk successfully for
=> > about 2 years. Beginning today, the cluster will not boot correctly.
=> >
=> > The RHCS services start, but fencing fails with:
=> >
=> >     dlm: no local IP address has been set
=> >     dlm: cannot start dlm lowcomms -107
=> >
=> > This seems to be a symptom of the fact that the cluster votes do not
=> > include votes from the quorum device:
=> >
=> >     # clustat
=> >     Cluster Status for example-infra @ Tue Mar 15 00:02:35 2011
=> >     Member Status: Inquorate
=> >
=> >      Member Name                        ID   Status
=> >      ------ ----                        ---- ------
=> >      example-infr2-admin.domain.com     1    Online, Local
=> >      example-infr1-admin.domain.com     2    Offline
=> >      /dev/mpath/quorum                  0    Offline
=> >
=> >     [root@example-infr2 ~]# cman_tool status
=> >     Version: 6.2.0
=> >     Config Version: 239
=> >     Cluster Name: example-infra
=> >     Cluster Id: 42813
=> >     Cluster Member: Yes
=> >     Cluster Generation: 676844
=> >     Membership state: Cluster-Member
=> >     Nodes: 1
=> >     Expected votes: 2
=> >     Total votes: 1
=> >     Quorum: 2 Activity blocked
=> >     Active subsystems: 7
=> >     Flags:
=> >     Ports Bound: 0
=> >     Node name: example-infr2-admin.domain.com
=> >     Node ID: 1
=> >     Multicast addresses: 239.192.167.228
=> >     Node addresses: 192.168.110.3
=>
=> You should check the output from cman_tool nodes. It appears that the
=> nodes are not seeing each other at all.

That's correct... at the time I ran cman_tool and clustat, one node was
down (deliberately, as part of troubleshooting, but the same situation
would arise from a hardware failure).

As I see it, the problem is not with inter-node communication but with
the quorum device. Note that only one vote is registered--there are no
votes from the quorum device. The quorum device should provide enough
votes to make the "cluster" quorate when only one node is running. If I
understand it correctly, this should also let the "cluster" start with a
single node (as long as that node can write to the quorum device).

If my understanding is wrong, then how can a 2-node cluster start if one
node is down?

=> The first things I would check are iptables, whether the node names
=> resolve to the correct IP addresses, selinux, and whether the switch
=> between the nodes supports multicast.

SELinux is disabled (as it has been for the 2 years this cluster has
been operational). There have been no switch changes. Node names and IPs
resolve correctly. iptables permits all communication between the
"admin" addresses on the servers.

=> Fabio

Thanks,

Mark
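
P.S. For anyone who finds this thread with the same symptom: below is a
minimal sketch of the cluster.conf stanzas that give a quorum device a
deciding vote in a 2-node cluster. It is illustrative only--the label,
the timing values, and the ping heuristic (including the gateway
address) are placeholders, not our production configuration:

    <cluster name="example-infra" config_version="239">
      <!-- 2 node votes + 1 qdisk vote = expected_votes 3, so quorum is 2
           and one node plus the quorum device (1 + 1 = 2) stays quorate.
           two_node must be 0 when a quorum disk is used. -->
      <cman expected_votes="3" two_node="0"/>
      <!-- qdiskd must be running on each node for this vote to register
           with cman; label and heuristic here are placeholder values -->
      <quorumd label="quorum" device="/dev/mpath/quorum" votes="1"
               interval="1" tko="10">
        <heuristic program="ping -c1 -w1 192.168.110.254" score="1"
                   interval="2" tko="3"/>
      </quorumd>
      <!-- clusternode and fence entries omitted -->
    </cluster>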
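
P.P.S. A few commands that should narrow down whether qdiskd itself is
the problem (eth0 is a placeholder for whatever interface carries the
cluster traffic on your hardware):

    # is qdiskd running, and can it find the labelled quorum device?
    service qdiskd status
    mkqdisk -L

    # does cman see the other node and the quorum disk?
    cman_tool nodes

    # is the cluster multicast traffic actually arriving on this node?
    tcpdump -n -i eth0 host 239.192.167.228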