Yes, something is wrong. I did a little more research:

- stop cluster daemons on both nodes (node a & node b)
- start cluster on node a

It hangs for 5 minutes in cman's fencing step, like this:

  Starting cluster:
     Loading modules... done
     Mounting configfs... done
     Starting ccsd... done
     Starting cman... done
     Starting daemons... done
     Starting fencing...

... and the process list shows this:

  /sbin/fence_tool -w -t 300 join

... so that's the 5 minutes (300 seconds). The question is: why does it wait there for 5 minutes?

- after 5 minutes of waiting, node a says:

  Starting fencing... failed                           [FAILED]
  Starting the Quorum Disk Daemon:                     [  OK  ]
  Starting Cluster Service Manager:                    [  OK  ]

... and then it loads qdiskd, and after a while it has 2 votes and starts services normally. Voila, I have a running cluster with one node:

  Node  Sts   Inc   Joined               Name
     0   M      0   2008-04-17 12:51:01  /dev/sda
     1   M   1356   2008-04-17 12:45:44  areenasql1
     2   X      0                        areenasql2

  [root@areenasql1 ~]# cman_tool status
  Version: 6.0.1
  Config Version: 4
  Cluster Name: areena_sql
  Cluster Id: 39330
  Cluster Member: Yes
  Cluster Generation: 1356
  Membership state: Cluster-Member
  Nodes: 1
  Expected votes: 3
  Total votes: 2
  Quorum: 2
  Active subsystems: 8
  Flags:
  Ports Bound: 0 177
  Node name: areenasql1
  Node ID: 1
  Multicast addresses: 239.192.153.60
  Node addresses: 10.1.1.178

But the log says nothing about the failed fencing. Fencing is configured correctly; I use HP iLO and everything is OK. Fencing works fine in a running cluster, and both nodes can fence each other.

Node a should fence node b in this situation, and maybe it is trying to do so somehow, but it logs nothing. It should log at least "fence failed" or similar if it is unable to fence node b.

And what's more important: if we assume node a can't fence node b in this startup situation, it should NOT start services, but it does.
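To double-check the vote numbers in the cman_tool status output above, here is a minimal sketch of the arithmetic. It assumes quorum is a simple majority of the expected votes (expected_votes // 2 + 1); that formula is my assumption and only happens to match the "Quorum: 2" line, and the function names are made up for illustration.

```python
def quorum_threshold(expected_votes: int) -> int:
    # Assumption: quorum is a simple majority of the expected votes.
    return expected_votes // 2 + 1


def is_quorate(total_votes: int, expected_votes: int) -> bool:
    # The node is quorate once the votes it can see reach the threshold.
    return total_votes >= quorum_threshold(expected_votes)


expected = 3          # "Expected votes: 3" (node a + node b + qdisk)
node_a_alone = 1      # before qdiskd has registered its vote
node_a_and_qdisk = 2  # "Total votes: 2" (node vote + qdisk vote)

print(quorum_threshold(expected))
print(is_quorate(node_a_alone, expected))
print(is_quorate(node_a_and_qdisk, expected))
```

If that assumption holds, node a alone is not quorate until the qdisk vote arrives, which would fit the order seen at startup: fencing is attempted before the Quorum Disk Daemon is started.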
-hjp

On Thu, 2008-04-17 at 11:32 +0200, jr wrote:
> On Thursday, 17.04.2008, at 12:28 +0300, Harri Päiväniemi wrote:
> > Well,
> >
> > I don't have any mistakes with firewalls, hosts, names, IPs etc. This
> > is a fact. Communication itself works. Maybe it sounds strange when I say
> > I don't have mistakes, but this time it's true ;)
> >
> > In this case the cluster should gain quorum and start running services on
> > node a (it has 2 votes: node vote + qdisk vote).
> >
> > It should fence node b first, because it doesn't know where it is.
> >
> > So this behaviour is wrong.
> >
> > -hjp
>
> I think something is wrong here, like the expected votes or similar. If
> the one node had 2 votes and those were the expected votes, it would
> maintain quorum and thus fence the other node. That "connection refused"
> error seems to say that the node doesn't have quorum nonetheless.
> Can you confirm that? (clustat should show you whether that node is
> quorate or not.)
> regards,
> johannes
>
>
> --
> Linux-cluster mailing list
> Linux-cluster@xxxxxxxxxx
> https://www.redhat.com/mailman/listinfo/linux-cluster