Re: Fencing issues with fence_apc_snmp (APC Firmware 6.x)

Hi,

On 10/13/2014 09:10 PM, Thomas Meier wrote:
Hi

When configuring PDU fencing in my 2-node-cluster I ran into some problems with
the fence_apc_snmp agent. Turning a node off works fine, but
fence_apc_snmp then exits with error.



When I do this manually (from node2):

    fence_apc_snmp -a node1 -n 1 -o off

the output of the command is not the expected:

    Success: Powered OFF

but in my case:

    Returned 2: Error in packet.
    Reason: (genError) A general failure occured
    Failed object: .1.3.6.1.4.1.318.1.1.4.4.2.1.3.21


When I check the PDU, the port is without power, so this part works.
But it seems that the fence agent can't read the status of the PDU
and then exits with an error. The same seems to happen when fenced
calls the agent: the agent exits with an error, fencing can't succeed,
and the cluster hangs.

Yes, this is a known bug: in the 6.x firmware, APC changed one of the information tables.
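
To see which step fails, the "off" and "status" actions can be run separately and their exit codes compared. The following is only a rough sketch: the helper names build_fence_command and run_fence are hypothetical, and the only options used are the -a, -n and -o ones from the commands quoted in this thread. No particular exit code or output text is assumed.

```python
import subprocess


def build_fence_command(address, plug, action):
    """Build the argument list for one fence_apc_snmp call (hypothetical helper)."""
    return ["fence_apc_snmp", "-a", str(address), "-n", str(plug), "-o", action]


def run_fence(address, plug, action):
    """Run the agent once and return (exit code, combined output as text)."""
    proc = subprocess.Popen(
        build_fence_command(address, plug, action),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
    )
    out, _ = proc.communicate()
    # Positional decode arguments keep this usable on old Python 2 as well.
    return proc.returncode, out.decode("utf-8", "replace")
```

For example, calling run_fence("node1", 1, "off") and then run_fence("node1", 1, "status") would show whether only the status read returns a non-zero exit code.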

I've already found the fence-agents repo: https://git.fedorahosted.org/cgit/fence-agents.git/

Here https://git.fedorahosted.org/cgit/fence-agents.git/commit/?id=55ccdd79f530092af06eea5b4ce6a24bd82c0875
it says: "fence_apc_snmp: Add support for firmware 6.x"

Yes, this should fix the issue.

I've managed to build fence-agents-4.0.11.tar.gz on a CentOS 6.5 test box, but my build
of fence_apc_snmp doesn't work.

It gives:

[root@box1]# fence_apc_snmp -v -a node1 -n 1 -o status
Traceback (most recent call last):
   File "/usr/sbin/fence_apc_snmp", line 223, in <module>
     main()
   File "/usr/sbin/fence_apc_snmp", line 197, in main
     options = check_input(device_opt, process_input(device_opt))
   File "/usr/share/fence/fencing.py", line 705, in check_input
     logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stderr))
TypeError: __init__() got an unexpected keyword argument 'stream'

Feel free to remove the logging if it does not work. The other option is to take just the patch from git and backport it. There should be no big differences (I expect only very minor changes).
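
For reference, the traceback is consistent with Python 2.6 (the stock interpreter on CentOS 6), where the first parameter of logging.StreamHandler is named strm; it was renamed to stream in Python 2.7. A minimal sketch of an alternative to removing the logging, assuming that is the cause: pass the stream positionally, which works under either name. The helper name add_stderr_handler is hypothetical and is not part of fencing.py.

```python
import logging
import sys


def add_stderr_handler(logger=None):
    """Attach a stderr handler to a logger (hypothetical helper).

    The stream is passed positionally, so the call does not depend on
    whether the keyword is spelled 'strm' (Python 2.6) or 'stream' (2.7+).
    """
    if logger is None:
        logger = logging.getLogger()
    handler = logging.StreamHandler(sys.stderr)
    logger.addHandler(handler)
    return handler
```

In the failing line of check_input, the equivalent change would be to drop the stream= keyword and keep sys.stderr as the only argument.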

I'd really like to see if a patched fence_apc_snmp agent fixes my problem, and if so,
install the right version of fence_apc_snmp on the cluster without breaking things,
but I'm not sure how to build a working version.

Sure, there will be a new official release for RHEL 6.7 (as 6.6 was released a few days ago). Until then, the only options are upstream or patches.

m,

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster



