On 01/27/2012 04:03 AM, Parvez Shaikh wrote:
> Hi guys,
>
> I am using Red Hat Cluster Suite, which comes with RHEL 5.5:
>
> cman_tool version
> 6.2.0 config xxx
>
> Now I have a script resource in which I return $OCF_ERR_CONFIGURED in
> case of a fatal, irrecoverable error, hoping that my service would not
> start on another cluster node. But I see that the cluster relocates it
> to another cluster node and attempts to start it.
>
> I referred to the error code documentation at
> http://www.linux-ha.org/doc/dev-guides/_return_codes.html
>
> Is there any return code that makes RHCS give up on recovering the
> service?

The resource must fail during the 'stop' phase if you want rgmanager to not try to recover it. There is no 'start' phase error condition that tells rgmanager to give up.
The history: If you don't have a program installed or configured on host1 but try to enable a service there, it will obviously fail to start (rightfully so). However, host2 may have the configuration. So, rgmanager will then stop the service and try to start it on host2. In fact, it will systematically try every host in the cluster until:
- the service starts successfully
- no more hosts are available (e.g. restricted failover domain, exclusive services, or simply all hosts were tried). At this point, the service is placed in the 'stopped' state in the hopes that the next host to come online will be able to start the service
- a failure during 'stop' occurs. Most errors during the stop phase will trigger an abortion of the enable request (except 'OCF_NOT_INSTALLED' when a <script> is missing)

-- Lon
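
For readers who want to reason about the three exit conditions listed above, here is a rough toy model in Python. It is only a sketch of the behaviour described in this message, not rgmanager's actual code: `place_service`, `Outcome`, and the host names are hypothetical, the start and stop callables stand in for a script resource, and the 'OCF_NOT_INSTALLED' exception during stop is deliberately not modelled.

```python
from enum import Enum
from typing import Callable, List, Tuple


class Outcome(Enum):
    STARTED = "started"  # the service started on some host
    STOPPED = "stopped"  # every candidate host was tried without success
    ABORTED = "aborted"  # a stop failed, so the enable request was aborted


def place_service(
    hosts: List[str],
    start: Callable[[str], int],
    stop: Callable[[str], int],
) -> Tuple[Outcome, List[str]]:
    """Toy model of the enable/relocate loop (hypothetical helper).

    start(host) and stop(host) return 0 on success and non-zero on
    failure, as a script resource would. Returns the outcome together
    with the list of hosts that were tried, in order.
    """
    tried: List[str] = []
    for host in hosts:
        tried.append(host)
        if start(host) == 0:
            return Outcome.STARTED, tried
        # A start failure never ends the search by itself: the service
        # is stopped on this host and the next host is tried.
        if stop(host) != 0:
            # Only a failure during stop ends recovery early.
            return Outcome.ABORTED, tried
    return Outcome.STOPPED, tried


OCF_ERR_CONFIGURED = 6  # numeric value from the OCF return-code list

# Start always fails with OCF_ERR_CONFIGURED, stop succeeds:
# every host is still tried, then the service ends up 'stopped'.
outcome, tried = place_service(
    ["node1", "node2"], lambda h: OCF_ERR_CONFIGURED, lambda h: 0
)
print(outcome.value, tried)  # stopped ['node1', 'node2']

# Same start failure, but stop also fails: recovery ends on the first host.
outcome, tried = place_service(
    ["node1", "node2"], lambda h: OCF_ERR_CONFIGURED, lambda h: 1
)
print(outcome.value, tried)  # aborted ['node1']
```

In this model the return value of start only distinguishes success from failure, which is why returning $OCF_ERR_CONFIGURED from start does not prevent relocation; only the stop path can end the search early.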