How to integrate a custom resource agent into RHCS?

<Ralph.Grothe@xxxxxxxxxxxxxx> · Mon, 30 May 2011 14:28:34 +0200

Hi,

I hope this is the right forum, so bear with me, Pacemaker
aficionados et alii, when I talk about Red Hat Cluster Suite
(RHCS).
That's the clusterware product I have been given to set up the
cluster, and I'm not free to choose other software of my liking.

Though this may sound ridiculous, I've been labouring for days
to get a fairly simple custom resource agent (henceforth RA)
acknowledged by RHCS and correctly executed through its
rgmanager.

When scripting my RA I mostly adhered to
http://www.linux-ha.org/doc/dev-guides/ra-dev-guide.html, except
where RHCS RAs differ from general OCF.

I put my RA in /usr/share/cluster and afterwards restarted
rgmanager on all nodes.

When I try to start the service that contains my RA's managed
resource, the service gets started but my resource does not, as
if it weren't part of the service at all.

When I try to start my resource via rg_test nothing happens apart
from this obscure log entry

[root@aruba:~]
# rg_test test /etc/cluster/cluster.conf start aDIStn_sec
Running in test mode.
Entity: line 2: parser error : Char 0x0 out of allowed range

^
Entity: line 2: parser error : Premature end of data in tag error
line 1

^
[root@aruba:~]
# echo $?
0

[root@aruba:~]
# grep rg_test /var/log/cluster.log|tail -1
May 30 13:54:55 aruba rg_test: [28643]: <err> Cannot dump
meta-data because '/usr/share/cluster/default.metadata' is
missing 

Though this is true

[root@aruba:~]
# ls -l /usr/share/cluster/default.metadata
ls: /usr/share/cluster/default.metadata: No such file or
directory

no such file is part of the installed clusterware at all
either

[root@aruba:~]
# yum groupinfo Clustering|tail -10|xargs rpm -ql|grep -c
default\\.metadata
0

Besides, I don't understand this error: since I wrote my RA
according to the above-mentioned RA Developer's Guide, it does
of course dump its metadata

[root@aruba:~]
# /usr/share/cluster/aDIStn_sec.sh meta-data|grep action
    <actions>
        <action name="start" timeout="0"/>
        <action name="stop" timeout="0"/>
        <action name="status" timeout="5"/>
        <action name="monitor" timeout="5"/>
        <action name="meta-data" timeout="0"/>
        <action name="verify-all" timeout="5"/>
        <action name="validate-all" timeout="5"/>
    </actions>

(Note: RHCS deviates from OCF here by naming its actions
verify-all instead of validate-all and status instead of monitor.
In my RA each pair is handled by the same case branch.)
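
For illustration, the shared case branches described in the note
could look roughly like the sketch below. This is not the actual
aDIStn_sec.sh: the ra_* function names and their bodies are
placeholders, and only the action names are taken from the
metadata listing above.

```shell
#!/bin/bash
# Hypothetical skeleton of an RA entry point. The ra_* functions are
# placeholders; a real agent would manage its resource in them.

ra_start()    { echo "start";    return 0; }
ra_stop()     { echo "stop";     return 0; }
ra_status()   { echo "status";   return 0; }
ra_validate() { echo "validate"; return 0; }
ra_metadata() { echo '<resource-agent name="placeholder"/>'; }

ra_dispatch() {
    case "$1" in
        start)                   ra_start ;;
        stop)                    ra_stop ;;
        # RHCS calls "status", plain OCF calls "monitor": same branch.
        status|monitor)          ra_status ;;
        # RHCS calls "verify-all", plain OCF calls "validate-all".
        verify-all|validate-all) ra_validate ;;
        meta-data)               ra_metadata ;;
        *)
            echo "usage: $0 {start|stop|status|monitor|meta-data|verify-all|validate-all}" >&2
            return 3   # OCF_ERR_UNIMPLEMENTED
            ;;
    esac
}

# Dispatch only when an action was passed on the command line.
if [ $# -gt 0 ]; then
    ra_dispatch "$@"
fi
```

With this layout either spelling of an action reaches the same
function, so the agent behaves identically whichever name the
cluster manager uses.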

I also don't understand the "Char 0x0 out of allowed range" error
from the XML parser.

If it really refers to line 2 of my cluster.conf, that line
looks fine to me

[root@aruba:~]
# sed -n 2p /etc/cluster/cluster.conf
<cluster alias="rhcs_mock" config_version="43" name="rhcs_mock">
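
Since that line looks clean, one thing that could be ruled out is
a literal NUL byte in the XML an agent prints for meta-data. It
is only an assumption that the parser error refers to agent
output rather than to cluster.conf. The helper names
count_nul_bytes and check_metadata below are made up for this
sketch.

```shell
#!/bin/bash
# count_nul_bytes: print the number of NUL (0x00) bytes read on stdin.
count_nul_bytes() {
    # Delete everything except NUL, then count what is left.
    tr -cd '\000' | wc -c | tr -d ' '
}

# check_metadata AGENT: run "AGENT meta-data" and report stray NUL bytes.
# Returns 1 if any NUL byte is found, 0 otherwise.
check_metadata() {
    local agent=$1 nuls
    nuls=$("$agent" meta-data 2>/dev/null | count_nul_bytes)
    if [ "$nuls" -gt 0 ]; then
        echo "$agent: $nuls NUL byte(s) in meta-data output"
        return 1
    fi
    echo "$agent: no NUL bytes in meta-data output"
}
```

It could be run as check_metadata /usr/share/cluster/aDIStn_sec.sh,
or in a loop over every agent in /usr/share/cluster.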

If I run a validity check of the XML of my cluster.conf against
RHCS's RNG schema I also get an incomprehensible error about
extra elements in interleave.

Nevertheless, all other resources of my cluster which rely on
RHCS's standard RAs are managed ok by the clusterware.

[root@aruba:~]
# declare -f cluconf_valid
cluconf_valid () 
{ 
    xmllint --noout --relaxng /usr/share/system-config-cluster/misc/cluster.ng ${1:-/etc/cluster/cluster.conf}
}
[root@aruba:~]
# cluconf_valid 
Relax-NG validity error : Extra element cman in interleave
/etc/cluster/cluster.conf:2: element cluster: Relax-NG validity
error : Element cluster failed to validate content
/etc/cluster/cluster.conf fails to validate

Btw. is there a schema file available to check an RA's metadata
for validity?

Of course I tested my RA script for correct functionality when
invoked like an init script (for which I provide the required
environment of OCF_RESKEY_parameter(s)),
and it starts, stops and monitors my resource as intended.
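
A manual test of that kind can be wrapped in a small helper, as
sketched here. The function name run_ra and the parameter name
"name" are hypothetical; the real parameter names are whatever
the RA declares in its metadata.

```shell
#!/bin/bash
# run_ra AGENT ACTION [NAME=VALUE...]
# Calls AGENT with ACTION, exporting each NAME=VALUE pair as
# OCF_RESKEY_NAME=VALUE for that single call only.
run_ra() {
    local agent=$1 action=$2
    shift 2
    (
        # The subshell keeps the exported variables from leaking
        # into the calling shell.
        for kv in "$@"; do
            export "OCF_RESKEY_${kv%%=*}=${kv#*=}"
        done
        "$agent" "$action"
    )
}
```

For example, run_ra /usr/share/cluster/aDIStn_sec.sh status
name=aDIStn_sec followed by echo $? shows the exit code the
agent returned for that action.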

Can anyone help?

Regards
Ralph

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster