On 13/10/14 03:10 PM, Thomas Meier wrote:
Hi
When configuring PDU fencing in my 2-node cluster I ran into some problems with
the fence_apc_snmp agent. Turning a node off works fine, but
fence_apc_snmp then exits with an error.
When I do this manually (from node2):
fence_apc_snmp -a node1 -n 1 -o off
the output of the command is not the expected:
Success: Powered OFF
but in my case:
Returned 2: Error in packet.
Reason: (genError) A general failure occured
Failed object: .1.3.6.1.4.1.318.1.1.4.4.2.1.3.21
When I check the PDU, the port is without power, so this part works.
But it seems that the fence agent can't read the status of the PDU
and then exits with an error. The same seems to happen when fenced
calls the agent: the agent exits with an error, fencing can't succeed,
and the cluster hangs.
From the logfile:
fenced[2100]: fence node1 dev 1.0 agent fence_apc_snmp result: error from agent
My setup:
- CentOS 6.5 with fence-agents-3.1.5-35.el6_5.4.x86_64 installed
- APC AP8953 PDU with firmware 6.1
- 2-node cluster based on https://alteeve.ca/w/AN!Cluster_Tutorial_2
- fencing agents in use: fence_ipmilan (working) and fence_apc_snmp
I did some research, and to me it looks like my fence-agents package is too old for my APC firmware.
I've already found the fence-agents repo: https://git.fedorahosted.org/cgit/fence-agents.git/
Here https://git.fedorahosted.org/cgit/fence-agents.git/commit/?id=55ccdd79f530092af06eea5b4ce6a24bd82c0875
it says: "fence_apc_snmp: Add support for firmware 6.x"
I've managed to build fence-agents-4.0.11.tar.gz on a CentOS 6.5 test box, but my build
of fence_apc_snmp doesn't work.
It gives:
[root@box1]# fence_apc_snmp -v -a node1 -n 1 -o status
Traceback (most recent call last):
File "/usr/sbin/fence_apc_snmp", line 223, in <module>
main()
File "/usr/sbin/fence_apc_snmp", line 197, in main
options = check_input(device_opt, process_input(device_opt))
File "/usr/share/fence/fencing.py", line 705, in check_input
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stderr))
TypeError: __init__() got an unexpected keyword argument 'stream'
I'd really like to see if a patched fence_apc_snmp agent fixes my problem, and if so,
install the right version of fence_apc_snmp on the cluster without breaking things,
but I'm a bit clueless about how to build a working version.
Maybe you have some tips?
Thanks in advance
Thomas
Hi Marek et al.,
This is a RHEL 6.5 install, so Kristoffer's comment about needing a
newer version of python is a bit of a concern. Has this been tested on
RHEL 6 with an APC with the 6.x firmware?
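For context on the Python concern: the TypeError in the quoted traceback is consistent with Python 2.6 (the system Python on RHEL/CentOS 6), where the first parameter of logging.StreamHandler was named 'strm'; it was renamed to 'stream' in 2.7. A minimal sketch of a call that should work on both, assuming the keyword name is the only incompatibility (I have not tested this against fence-agents 4.0.11, and the helper name make_stderr_handler is made up for illustration):

```python
import logging
import sys


def make_stderr_handler():
    """Return a StreamHandler that writes to stderr.

    Hypothetical helper, not part of fence-agents. Passing the stream
    positionally avoids the keyword name, which was 'strm' in Python 2.6
    and 'stream' from 2.7 onwards.
    """
    return logging.StreamHandler(sys.stderr)


# Same effect as the failing line in the traceback, minus the keyword.
handler = make_stderr_handler()
logging.getLogger().addHandler(handler)
```

If that one line in fencing.py is the only 2.7-ism, a local change along these lines might be enough for a test build, but there may well be others further down.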
cheers
--
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?
--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster