RE: Problem with fenced on cluster with 2 BladeCentermachines: 1st machine is remove physically. The remaining one doesnot became Active (waiting for fenced)

I am having the same issue. If a blade is not present (e.g. removed for
maintenance), fence_bladecenter cannot check its power state because the
bay is reported as empty. I think it would be simple to fix for those
versed in Perl. Normally the fence agent only runs against a blade that
is present. If the blade is removed while the cluster is running, you
run into this issue.

My case is below. Blade #3 is a good node. Blade #2 was removed. Fencing
does not work with the blade removed.

system> env -T system:blade[3]
OK
system:blade[3]> power -state
On
system:blade[3]> env -T system:blade[2]
The target bay is empty. 
system:blade[3]> env -T system:blade[1]
OK
system:blade[1]>
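
To make the distinction concrete, here is a minimal sketch of how the replies in the session above could be classified. It is written in Python for illustration only and is not the fence_bladecenter code (which is Perl). The names BayState, classify_bay and is_safely_fenced are hypothetical, the "Off" reply is assumed by analogy with the "On" reply shown above, and treating an empty bay as already fenced is my assumption about what a fix might do, not confirmed agent behavior.

```python
import re
from enum import Enum


class BayState(Enum):
    ON = "on"
    OFF = "off"
    EMPTY = "empty"
    UNKNOWN = "unknown"


# "OK", "On" and "The target bay is empty." come from the session above.
# "Off" is an assumed counterpart of "On" and is not shown in the session.
EMPTY_BAY = re.compile(r"The target bay is empty", re.IGNORECASE)
TARGET_OK = re.compile(r"^\s*OK\s*$", re.MULTILINE)
POWER_STATE = re.compile(r"^\s*(On|Off)\s*$", re.MULTILINE | re.IGNORECASE)


def classify_bay(env_reply: str, power_reply: str = "") -> BayState:
    """Classify a bay from the replies to `env -T` and `power -state`."""
    if EMPTY_BAY.search(env_reply):
        return BayState.EMPTY
    if not TARGET_OK.search(env_reply):
        return BayState.UNKNOWN
    match = POWER_STATE.search(power_reply)
    if match is None:
        return BayState.UNKNOWN
    return BayState.ON if match.group(1).lower() == "on" else BayState.OFF


def is_safely_fenced(state: BayState) -> bool:
    # ASSUMPTION: a blade that is physically absent cannot touch shared
    # storage, so an empty bay is counted the same as a powered-off blade.
    return state in (BayState.OFF, BayState.EMPTY)
```

With the replies above, blade 3 classifies as ON (not fenced) and blade 2 as EMPTY, which this sketch would accept as fenced instead of timing out on a pattern match.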

-----Original Message-----
From: linux-cluster-bounces@xxxxxxxxxx
[mailto:linux-cluster-bounces@xxxxxxxxxx] On Behalf Of James Parsons
Sent: Thursday, July 12, 2007 12:33 PM
To: linux clustering
Subject: Re:  Problem with fenced on cluster with 2
BladeCentermachines: 1st machine is remove physically. The remaining one
doesnot became Active (waiting for fenced)

catalin.lupescu@xxxxxxxx wrote:

>
> Hello!
>
> I have a Red Hat cluster made of 2 IBM blade nodes in a BladeCenter 
> chassis.
> (fenced version 1.32.6)
>
> I have done the following test:
> I physically removed the node 1 machine (the active one).
> The second node never became the active one. The "clustat" command does 
> not print any information.
> In /var/log/messages we found the following messages (repeated):
>
> Jul 11 17:46:24 cdrc1-2 fenced[4214]: fencing node "cdrc1-1"
> Jul 11 17:46:38 cdrc1-2 fenced[4214]: agent "fence_bladecenter" 
> reports: pattern match timed-out at /sbin/fence_bladecenter line 185 
> Jul 11 17:46:38 cdrc1-2 fenced[4214]: fence "cdrc1-1" failed
>
> If node 1 is plugged in, node 2 becomes active (fencing OK)
>
bz#240509 changed the sleep timeout in the bladecenter agent from 5 to
10; this is on or about line 193 in /sbin/fence_bladecenter. See what
yours is set at, and try pushing it out a bit. This minor change is
making its way through the distribution chain now.

-j

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
