Your nodes don't seem to be able to communicate:

Oct 30 16:08:15 rhel-cluster-node2 fenced[3549]: rhel-cluster-node1.mgmt.local not a cluster member after 3 sec post_join_delay
Oct 30 16:08:15 rhel-cluster-node2 fenced[3549]: fencing node "rhel-cluster-node1.mgmt.local"
Oct 30 16:08:29 rhel-cluster-node2 fenced[3549]: fence "rhel-cluster-node1.mgmt.local" success

I never see them form a cluster:

Oct 30 16:03:25 rhel-cluster-node2 openais[3511]: [CLM ] CLM CONFIGURATION CHANGE
Oct 30 16:03:25 rhel-cluster-node2 openais[3511]: [CLM ] New Configuration:
Oct 30 16:03:25 rhel-cluster-node2 openais[3511]: [CLM ] r(0) ip(10.4.1.102)
Oct 30 16:03:25 rhel-cluster-node2 openais[3511]: [CLM ] Members Left:
Oct 30 16:03:25 rhel-cluster-node2 openais[3511]: [CLM ] Members Joined:
Oct 30 16:03:25 rhel-cluster-node2 openais[3511]: [CLM ] CLM CONFIGURATION CHANGE
Oct 30 16:03:25 rhel-cluster-node2 openais[3511]: [CLM ] New Configuration:
Oct 30 16:03:26 rhel-cluster-node2 openais[3511]: [CLM ] r(0) ip(10.4.1.102)
Oct 30 16:03:26 rhel-cluster-node2 openais[3511]: [CLM ] Members Left:
Oct 30 16:03:26 rhel-cluster-node2 openais[3511]: [CLM ] Members Joined:
Oct 30 16:03:26 rhel-cluster-node2 openais[3511]: [SYNC ] This node is within the primary component and will provide service.
Oct 30 16:03:26 rhel-cluster-node2 openais[3511]: [TOTEM] entering OPERATIONAL state.

Are the nodes just rebooting each other in a cycle? If so, my guess is that you are having issues routing the multicast traffic. An easy test is to try using broadcast. Change your cman tag to say:

<cman expected_votes="1" two_node="1" broadcast="yes"/>

If your nodes can form a cluster with that set, then you need to evaluate your multicast config.

-Ben

----- "Wahyu Darmawan" <wahyu@xxxxxxxxxxxxxx> wrote:

> Hi all,
>
> Thanks. I've replaced the mainboard on both servers, but there's
> another problem. Both servers are active after the mainboard
> replacement.
>
> However, when I restart the node that is active, the other node is
> restarted as well. This happens during fencing, and it repeats, so
> the two nodes keep restarting each other.
>
> Need your suggestion please..
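To make Ben's broadcast test above concrete: in the cluster.conf quoted further down in this thread, only the cman element would change. The incremented config_version shown here is an assumption about how the change would normally be applied, not something stated in the thread; everything else stays exactly as posted.

    <?xml version="1.0"?>
    <cluster alias="PORTAL_WORLD" config_version="33" name="PORTAL_WORLD">
        <!-- only this element changes: broadcast="yes" switches cman from its
             default multicast transport to broadcast for the totem traffic -->
        <cman expected_votes="1" two_node="1" broadcast="yes"/>
        <!-- clusternodes, quorumd, fencedevices and rm stay exactly as in the
             configuration quoted below -->
    </cluster>

Since the nodes are not forming a cluster in the first place, the edited file would presumably have to be copied to both nodes by hand and cman restarted on each (for example via the stock cman init script) before the broadcast test says anything useful.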
> Please find the attachment of /var/log/messages/
>
> And, here's my cluster.conf:
>
> <?xml version="1.0"?>
> <cluster alias="PORTAL_WORLD" config_version="32" name="PORTAL_WORLD">
>     <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
>     <clusternodes>
>         <clusternode name="rhel-cluster-node1.mgmt.local" nodeid="1" votes="1">
>             <fence>
>                 <method name="1">
>                     <device name="NODE1-ILO"/>
>                 </method>
>             </fence>
>         </clusternode>
>         <clusternode name="rhel-cluster-node2.mgmt.local" nodeid="2" votes="1">
>             <fence>
>                 <method name="1">
>                     <device name="NODE2-ILO"/>
>                 </method>
>             </fence>
>         </clusternode>
>     </clusternodes>
>     <quorumd device="/dev/sdf1" interval="3" label="quorum_disk1" tko="23" votes="2">
>         <heuristic interval="2" program="ping 10.4.0.1 -c1 -t1" score="1"/>
>     </quorumd>
>     <cman expected_votes="1" two_node="1"/>
>     <fencedevices>
>         <fencedevice agent="fence_ilo" hostname="ilo-node2" login="Administrator" name="NODE2-ILO" passwd="password"/>
>         <fencedevice agent="fence_ilo" hostname="ilo-node1" login="Administrator" name="NODE1-ILO" passwd="password"/>
>     </fencedevices>
>     <rm>
>         <failoverdomains>
>             <failoverdomain name="Failover" nofailback="0" ordered="0" restricted="0">
>                 <failoverdomainnode name="rhel-cluster-node2.mgmt.local" priority="1"/>
>                 <failoverdomainnode name="rhel-cluster-node1.mgmt.local" priority="1"/>
>             </failoverdomain>
>         </failoverdomains>
>         <resources>
>             <ip address="10.4.1.103" monitor_link="1"/>
>         </resources>
>         <service autostart="1" domain="Failover" exclusive="0" name="IP_Virtual" recovery="relocate">
>             <ip ref="10.4.1.103"/>
>         </service>
>     </rm>
> </cluster>
>
> Thanks,
>
> From: linux-cluster-bounces@xxxxxxxxxx [mailto:linux-cluster-bounces@xxxxxxxxxx] On Behalf Of Dustin Henry Offutt
> Sent: Thursday, October 28, 2010 11:46 PM
> To: linux clustering
> Subject: Re: Fence Issue on BL 460C G6
>
> I believe your problem is being caused by "nofailback" being set to "1":
>
> <failoverdomain name="Failover" nofailback="1" ordered="0" restricted="0">
>
> Set it to zero and I believe your problem will be resolved.
>
> On Wed, Oct 27, 2010 at 10:43 PM, Wahyu Darmawan <wahyu@xxxxxxxxxxxxxx> wrote:
>
> Hi Ben,
> Here is my cluster.conf. Need your help please.
>
> <?xml version="1.0"?>
> <cluster alias="PORTAL_WORLD" config_version="32" name="PORTAL_WORLD">
>     <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
>     <clusternodes>
>         <clusternode name="rhel-cluster-node1.mgmt.local" nodeid="1" votes="1">
>             <fence>
>                 <method name="1">
>                     <device name="NODE1-ILO"/>
>                 </method>
>             </fence>
>         </clusternode>
>         <clusternode name="rhel-cluster-node2.mgmt.local" nodeid="2" votes="1">
>             <fence>
>                 <method name="1">
>                     <device name="NODE2-ILO"/>
>                 </method>
>             </fence>
>         </clusternode>
>     </clusternodes>
>     <quorumd device="/dev/sdf1" interval="3" label="quorum_disk1" tko="23" votes="2">
>         <heuristic interval="2" program="ping 10.4.0.1 -c1 -t1" score="1"/>
>     </quorumd>
>     <cman expected_votes="1" two_node="1"/>
>     <fencedevices>
>         <fencedevice agent="fence_ilo" hostname="ilo-node2" login="Administrator" name="NODE2-ILO" passwd="password"/>
>         <fencedevice agent="fence_ilo" hostname="ilo-node1" login="Administrator" name="NODE1-ILO" passwd="password"/>
>     </fencedevices>
>     <rm>
>         <failoverdomains>
>             <failoverdomain name="Failover" nofailback="1" ordered="0" restricted="0">
>                 <failoverdomainnode name="rhel-cluster-node2.mgmt.local" priority="1"/>
>                 <failoverdomainnode name="rhel-cluster-node1.mgmt.local" priority="1"/>
>             </failoverdomain>
>         </failoverdomains>
>         <resources>
>             <ip address="10.4.1.103" monitor_link="1"/>
>         </resources>
>         <service autostart="1" domain="Failover" exclusive="0" name="IP_Virtual" recovery="relocate">
>             <ip ref="10.4.1.103"/>
>         </service>
>     </rm>
> </cluster>
>
> Many thanks,
> Wahyu
>
> -----Original Message-----
> From: linux-cluster-bounces@xxxxxxxxxx [mailto:linux-cluster-bounces@xxxxxxxxxx] On Behalf Of Ben Turner
> Sent: Thursday, October 28, 2010 12:18 AM
> To: linux clustering
> Subject: Re: Fence Issue on BL 460C G6
>
> My guess is there is a problem with fencing. Are you running fence_ilo
> with an HP blade? IIRC the iLOs on the blades have a different CLI, so
> I don't think fence_ilo will work with them. What do you see in the
> messages files during these events? If you see failed fence messages,
> you may want to look into using fence_ipmilan:
>
> http://sources.redhat.com/cluster/wiki/IPMI_FencingConfig
>
> If you post a snip of your messages file from this event and your
> cluster.conf, I will have a better idea of what is going on.
>
> -b
>
> ----- "Wahyu Darmawan" <wahyu@xxxxxxxxxxxxxx> wrote:
>
> > Hi all,
> >
> > For fencing, I'm using HP iLO and the servers are BL460c G6 blades.
> > The problem is that the resource only starts moving to the passive
> > node when the failed node is powered back on. It is really strange
> > for me. For example, I shut down node1 and physically removed the
> > node1 machine from the blade chassis while monitoring the clustat
> > output; clustat was still showing the resource on node 1, even
> > though node 1 was powered down and removed from the c7000 blade
> > chassis. But when I plugged the failed node1 back into the c7000
> > blade chassis and it powered on, clustat showed the resource
> > starting to move from the failed node to the passive node.
> > I'm powering down the blade server with the power button on its
> > front and then removing it from the chassis. If we hit a hardware
> > problem on the active node and it goes down, how will the resource
> > move to the passive node? In addition, when I rebooted or shut down
> > the machine from the CLI, the resource moved to the passive node
> > successfully.
> > Furthermore, when I shut down the active node with the
> > "shutdown -hy 0" command, the active node automatically restarts
> > after shutting down.
> >
> > Please help me.
> >
> > Many Thanks,

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
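Following up on Ben Turner's fence_ipmilan suggestion earlier in the thread: if the blade iLOs do need IPMI-over-LAN rather than the fence_ilo agent, the swap would only touch the fencedevice definitions, roughly as sketched below. The lanplus setting and the reuse of the existing hostnames and credentials are assumptions for illustration, not values confirmed anywhere in the thread.

    <fencedevices>
        <!-- hypothetical fence_ipmilan replacements for the fence_ilo entries;
             ipaddr/login/passwd must match whatever the iLO's IPMI interface expects -->
        <fencedevice agent="fence_ipmilan" ipaddr="ilo-node1" login="Administrator" passwd="password" lanplus="1" name="NODE1-ILO"/>
        <fencedevice agent="fence_ipmilan" ipaddr="ilo-node2" login="Administrator" passwd="password" lanplus="1" name="NODE2-ILO"/>
    </fencedevices>

Because the device names are unchanged, the <device name="NODE1-ILO"/> and <device name="NODE2-ILO"/> references inside the clusternode blocks could stay as they are; whether the iLOs on these blades actually answer IPMI-over-LAN still has to be verified, which is what the IPMI_FencingConfig page Ben linked covers.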