Re: $OCF_ERR_CONFIGURED - recovers service on another cluster node

Lon Hohberger <lhh@xxxxxxxxxx> · Wed, 08 Feb 2012 09:36:26 -0500

On 01/27/2012 04:03 AM, Parvez Shaikh wrote:
Hi guys,

I am using Red Hat Cluster Suite which comes with RHEL 5.5 -

cman_tool version
 >>6.2.0 config xxx

Now I have a script resource in which I return $OCF_ERR_CONFIGURED in
case of a fatal, unrecoverable error, hoping that my service would not
be started on another cluster node.

But I see that the cluster relocates it to another cluster node and
attempts to start it there.

I referred to the return code documentation at
http://www.linux-ha.org/doc/dev-guides/_return_codes.html

Is there any return code that makes RHCS give up on recovering the service?

The resource must fail during the 'stop' phase if you want rgmanager to 
not try to recover it.  There is no 'start' phase error condition that 
tells rgmanager to give up.
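
One way to act on this from inside a script resource, as a rough sketch only: have 'start' leave a marker when it hits the unrecoverable error, and have 'stop' return non-zero while that marker exists. Everything named here (MARKER, CONFIG, check_prerequisites, the paths) is hypothetical and not part of rgmanager; the readable-config test merely stands in for whatever the real fatal condition is.

```shell
#!/bin/sh
# Hypothetical init-style script for a <script> resource (illustrative only).
# MARKER, CONFIG and check_prerequisites are assumptions, not rgmanager names.

MARKER="${MARKER:-/var/run/myservice.fatal}"

# Stand-in for the real fatal-error detection: the config must be readable.
check_prerequisites() {
    [ -r "${CONFIG:-/etc/myservice.conf}" ]
}

svc_start() {
    if ! check_prerequisites; then
        : > "$MARKER"        # remember that start hit an unrecoverable error
        return 1
    fi
    rm -f "$MARKER"
    # ... start the real daemon here ...
    return 0
}

svc_stop() {
    # ... stop the real daemon here ...
    if [ -e "$MARKER" ]; then
        return 1             # deliberately fail the stop phase
    fi
    return 0
}

# Dispatch only when called with an argument, as an init script would be.
if [ "$#" -gt 0 ]; then
    case "$1" in
        start)  svc_start ;;
        stop)   svc_stop ;;
        status) true ;;      # placeholder: a real health check goes here
        *)      echo "usage: $0 {start|stop|status}" >&2; exit 2 ;;
    esac
    exit $?
fi
```

Note that the marker is only cleared by a later successful start or by hand, so 'stop' on that host keeps failing until someone fixes the underlying problem.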

The history:  If you don't have a program installed or configured on 
host1 but try to enable a service there, it will obviously fail to start 
(rightfully so).  However, host2 may have the configuration.  So, 
rgmanager will then stop the service and try to start it on host2.  In 
fact, it will systematically try every host in the cluster until:

  - the service starts successfully

  - no more hosts are available (e.g. restricted failover domain,
    exclusive services, or simply all hosts were tried).  At this
    point, the service is placed in the 'stopped' state in
    the hopes that the next host to come online will be able to
    start the service

  - a failure during 'stop' occurs.  Most errors during the stop
    phase will abort the enable request (except
    'OCF_NOT_INSTALLED' when a <script> is missing)
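
The three exit conditions above can be modelled with a toy loop. This is not rgmanager code, just a sketch of the walk as described: try_start and try_stop are hypothetical stand-ins for running the resource on a host, START_OK and STOP_FAIL are made-up variables listing host names, and the 'OCF_NOT_INSTALLED' exception is left out.

```shell
#!/bin/sh
# Toy model of the host-by-host recovery walk; illustrative only.
# START_OK:  space-separated hosts where 'start' would succeed (assumption).
# STOP_FAIL: space-separated hosts where 'stop' would fail (assumption).

try_start() {
    case " $START_OK " in *" $1 "*) return 0 ;; *) return 1 ;; esac
}

try_stop() {
    case " $STOP_FAIL " in *" $1 "*) return 1 ;; *) return 0 ;; esac
}

# Usage: enable_service host1 host2 ...
# Prints the outcome and returns 0 only if the service started somewhere.
enable_service() {
    for host in "$@"; do
        if try_start "$host"; then
            echo "started on $host"
            return 0
        fi
        # Start failed: stop on this host before trying the next one.
        if ! try_stop "$host"; then
            echo "aborted on $host"    # a stop failure ends the walk
            return 1
        fi
    done
    echo "stopped"                     # every candidate host was tried
    return 1
}
```

For example, with START_OK="host2" and STOP_FAIL empty, `enable_service host1 host2 host3` prints "started on host2"; with START_OK empty and STOP_FAIL="host1" it prints "aborted on host1" and host2 is never tried.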

-- Lon
