RE: RHEL5.0 Cluster fencing problems involving bonding

I think I know what's going on ...

When I take down the two slave interfaces (eth2 & eth3) on Node C, the
bond1 interface remains UP.
This means that Node C still thinks it is OK; however, it cannot see
Nodes A & B, and tries to fence Node B.
Node A, which is the master, fences Node C.
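
One way to confirm this is to ask the bonding driver how many slaves it
still considers up. The following is only a sketch, assuming the usual
layout of the /proc/net/bonding/<bond> status file ("Slave Interface:"
sections, each followed by an "MII Status:" line). The function name
count_up_slaves is made up for illustration.

```shell
# Hypothetical helper: count the slaves of a bond whose MII status is "up".
# Assumes the usual /proc/net/bonding/<bond> layout; the path argument
# defaults to bond1, the bond used for cluster traffic in this thread.
count_up_slaves() {
    status_file="${1:-/proc/net/bonding/bond1}"
    awk '
        # A new per-slave section starts here.
        /^Slave Interface:/ { in_slave = 1; next }
        # Only count the MII Status line that belongs to a slave section,
        # not the bond-level one near the top of the file.
        in_slave && /^MII Status:/ {
            if ($3 == "up") up++
            in_slave = 0
        }
        END { print up + 0 }
    ' "$status_file"
}
```

Running "count_up_slaves /proc/net/bonding/bond1" on Node C after both
ifdown commands should print 0 if the bond really has no working slave
left, even though bond1 itself still shows as UP.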

I'm not sure how to resolve this; any help would be appreciated.

D.

-----Original Message-----
From: linux-cluster-bounces@xxxxxxxxxx
[mailto:linux-cluster-bounces@xxxxxxxxxx] On Behalf Of Patel Dino
Sent: Monday, June 02, 2008 8:36 AM
To: linux clustering
Subject: RE: RHEL5.0 Cluster fencing problems involving bonding


At the time, Node A was the master.

I do have a quorum disk set up. When the two nodes (B & C) get fenced, the
cluster stays up with Node A & the quorum disk.


-----Original Message-----
From: linux-cluster-bounces@xxxxxxxxxx
[mailto:linux-cluster-bounces@xxxxxxxxxx] On Behalf Of Maciej Bogucki
Sent: Monday, June 02, 2008 7:24 AM
To: linux clustering
Subject: Re: RHEL5.0 Cluster fencing problems involving bonding


doobs72 _ wrote:
>
> Hi
>
>  
>
>  I'm having fencing problems in my 3 node cluster running on 
> RHEL5.0 which involves bonding.
>
>  
>
> I have 3 servers A, B & C in a cluster with bonding configured on eth2 
> & eth3 for my cluster traffic.  The config is as below:
>
>  
>
> DEVICE=eth2
> BOOTPROTO=none
> ONBOOT=yes
> TYPE=Ethernet
> MASTER=bond1
> SLAVE=yes
> USRCTL=no
>
> DEVICE=eth3
> BOOTPROTO=none
> ONBOOT=yes
> TYPE=Ethernet
> MASTER=bond1
> SLAVE=yes
> USRCTL=no
>
> DEVICE=bond1
> IPADDR=192.168.x.x
> NETMASK=255.255.255.0
> NETWORK=192.168.x.0
> BROADCAST=192.168.x.255
> ONBOOT=YES
> BOOTPROTO=none
>
>  
>
> The /etc/modprobe.conf file is configured as below:
>
>  
>
> alias eth0 bnx2
> alias eth1 bnx2
> alias eth2 e1000
> alias eth3 e1000
> alias eth4 e1000
> alias eth5 e1000
> alias scsi_hostadapter cciss
> alias bond0 bonding
> options bond0 miimon=100 mode=active-backup max_bonds=3
> alias bond1 bonding
> options bond1 miimon=100 mode=active-backup
> alias bond2 bonding
> options bond2 miimon=100 mode=active-backup
> alias scsi_hostadapter1 qla2xxx
> alias scsi_hostadapter2 usb-storage
>
>  
>
>  
>
> The cluster starts up OK, however when I try to test the bonded 
> interfaces my troubles begin.
>
> On Node C, if I run "ifdown bond1", Node C is fenced and everything 
> works as expected.
>
>  
>
> However if on Node C, I take down the interfaces one at a time i.e. 
>
>  "ifdown  eth2", - the cluster stays up as expected using eth3 for 
> routing traffic  
>
>   "ifdown eth3" 
>
> then node C is fenced by Node A. However, in the /var/log/messages file
> on Node C I see a message saying that Node B will be fenced. The 
> outcome is Nodes C & B are fenced.
>
>  
>
> My question is why does node B get fenced as well?
>
>
Hello,

First of all, you have a problem with bonding. Switch off the cluster
and investigate why the cluster goes down when you do "ifdown eth3". I
suspect that the problem is with the e1000 driver.
I suppose that C is the master of the cluster and that it acts faster
than A and B can elect a new master.
You can identify the master with:

  i=`cman_tool services | grep -A 1 default | tail -1 | sed -e 's/\[\(.\).*/\1/'`
  cman_tool nodes | awk '{print $1,$5}' | grep "^$i"

To resolve this issue you need to use more than one communication medium,
for example Ethernet or a quorum disk. Do you have one?

Best Regards
Maciej Bogucki


--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster