Re: SNMP support with IBM Blade Center Fence Agent

Parvez Shaikh <parvez.h.shaikh@xxxxxxxxx> · Tue, 1 Mar 2011 18:50:18 +0530

Hi Ryan,

Thank you for the response. Does this mean there is currently no way to notify the administrator when fencing fails?

Let me give more information about my cluster -

I have a set of nodes in the cluster, with only an IP resource being protected. I have two levels of fencing: first bladecenter fencing, then manual fencing.

At times, if a machine is already down (either from a power failure or from being turned off abruptly), bladecenter fencing times out and manual fencing takes over. At that point, the administrator is expected to run fence_ack_manual.

Clearly this is not desirable, as the services stay down until the administrator runs fence_ack_manual.

What is the recommended way to deal with bladecenter fencing failure in this situation? Do I have to add another level of fencing (between bladecenter and manual) that can fence automatically, without requiring manual intervention?
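
To make the question concrete, here is a rough sketch of what I imagine such a layout would look like for one node. The middle method and the device name "SecondaryFencing" are placeholders I made up for illustration; I have no such device configured, and it would also need a matching fencedevice entry.

```xml
<clusternode name="blade2" nodeid="2" votes="1">
  <fence>
    <method name="1">
      <device blade="2" name="BladeCenterFencing"/>
    </method>
    <!-- Hypothetical intermediate level: "SecondaryFencing" is a placeholder
         for some automatic fence device that is not configured yet. -->
    <method name="2">
      <device name="SecondaryFencing"/>
    </method>
    <method name="3">
      <device name="ManualFencing" nodename="blade2"/>
    </method>
  </fence>
</clusternode>
```

The idea is that manual fencing would then be reached only if both automatic methods fail.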

Thanks

On Mon, Feb 28, 2011 at 9:44 PM, Ryan O'Hara <rohara@xxxxxxxxxx> wrote:

On Mon, Feb 28, 2011 at 12:43:10PM +0530, Parvez Shaikh wrote:

> Hi all,
>
> I have a question related to fence agents and SNMP alarms.
>
> A fence agent can fail to fence the failed node for various reasons; e.g. with
> my bladecenter fencing agent, I sometimes get a message saying bladecenter
> fencing failed because of a timeout, or because the fence device IP address or
> user credentials are incorrect.
>
> In such a situation, is it possible to generate an SNMP trap?

This feature will be in RHEL6.1. There is a new project called
'foghorn' that creates SNMPv2 traps from dbus signals.

git://git.fedorahosted.org/foghorn.git

In RHEL6.1 (and the latest upstream release), certain cluster
components will emit dbus signals when certain events occur. This
includes fencing. So when a node is fenced, a dbus signal is generated
by fenced. The foghorn service catches this signal and generates an
SNMPv2 trap.

Note that foghorn runs as an AgentX subagent, so snmpd must be running
as the AgentX master agent.

Ryan

> My cluster config file is shown below. In my case, if bladecenter
> fencing fails, manual fencing kicks in and requires the user to run
> fence_ack_manual; for this, the user must at least be notified via SNMP (or
> some other mechanism?) so that they can intervene:

>

>   <clusternodes>
>     <clusternode name="blade2" nodeid="2" votes="1">
>       <fence>
>         <method name="1">
>           <device blade="2" name="BladeCenterFencing"/>
>         </method>
>         <method name="2">
>           <device name="ManualFencing" nodename="blade2"/>
>         </method>
>       </fence>
>     </clusternode>
>     <clusternode name="blade1" nodeid="1" votes="1">
>       <fence>
>         <method name="1">
>           <device blade="1" name="BladeCenterFencing"/>
>         </method>
>         <method name="2">
>           <device name="ManualFencing" nodename="blade1"/>
>         </method>
>       </fence>
>     </clusternode>
>   </clusternodes>
>   <cman expected_votes="1" two_node="1"/>
>   <fencedevices>
>     <fencedevice agent="fence_bladecenter" ipaddr="blade-mm.com" login="USERID" name="BladeCenterFencing" passwd="PASSW0RD"/>
>     <fencedevice agent="fence_manual" name="ManualFencing"/>
>   </fencedevices>

>

> Thanks,

> Parvez

> --

> Linux-cluster mailing list

> Linux-cluster@xxxxxxxxxx

> https://www.redhat.com/mailman/listinfo/linux-cluster
