RE: some questions about rgmanager

"Martin Waite" <Martin.Waite@xxxxxxxxxxxx> · Mon, 26 Oct 2009 11:05:18 -0000

Hi Brem,

Thanks for the pointers.  

The link to "OCF RA API Draft" appears to answer my questions.  It will take a while to digest all that.

I think you had a typo - "clusvcadm -D myfailedservice"  should be "clusvcadm -d myfailedservice".

My service (mysql) was failing because "shutdown_wait" was too low, causing stops and restarts to fail.  

Sure enough, your suggestion works:

  sudo /usr/sbin/clusvcadm -d mysql_service 

  (fix config)

  sudo /usr/sbin/clusvcadm -e mysql_service

And I suppose that if the service is in a mess on its current node (e.g. a software error prevents shutdown),
then I would disable the service and then enable it on another node:

  sudo /usr/sbin/clusvcadm -d mysql_service 
  sudo /usr/sbin/clusvcadm -e mysql_service -m othernode
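
If this becomes routine, the two sequences above could be folded into a small shell helper. This is only a rough sketch and has not been run against a real cluster: the function name `restart_failed_service` and the `CLUSVCADM` override variable are made up for illustration, and it assumes nothing beyond the `-d`, `-e` and `-m` options already used above.

```shell
#!/bin/sh
# Hypothetical helper: disable a failed service, then enable it again,
# optionally on a different node. CLUSVCADM can be overridden for dry runs.
CLUSVCADM="${CLUSVCADM:-sudo /usr/sbin/clusvcadm}"

# usage: restart_failed_service <service> [target-node]
restart_failed_service() {
    svc="${1:-}"
    node="${2:-}"
    if [ -z "$svc" ]; then
        echo "usage: restart_failed_service <service> [target-node]" >&2
        return 2
    fi

    # Disable first; per the thread, this gets the service out of "failed".
    $CLUSVCADM -d "$svc" || return 1

    # Any configuration fix would go here, before the service is re-enabled.

    if [ -n "$node" ]; then
        # Enable on the named node instead of the current one.
        $CLUSVCADM -e "$svc" -m "$node"
    else
        $CLUSVCADM -e "$svc"
    fi
}
```

With that loaded, `restart_failed_service mysql_service othernode` would issue the same two commands as above.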

regards,
Martin

-----Original Message-----
From: linux-cluster-bounces@xxxxxxxxxx [mailto:linux-cluster-bounces@xxxxxxxxxx] On Behalf Of brem belguebli
Sent: 23 October 2009 19:21
To: linux clustering
Subject: Re:  some questions about rgmanager

2009/10/23 Martin Waite <Martin.Waite@xxxxxxxxxxxx>:
> Hi,
>
> Are there any guidelines about how to write resource scripts that will
> be run by rgmanager /clurgmgrd ?
>
> I have been tracing execution through rg_test, but I don't know how
> representative this is.  For example, performing a service check through
> rg_test calls just about every script in /usr/share/cluster with the
> "meta-data" command, then calling service.sh with command "status", and
> finally the resource script with the command "status".   Is this what
> will happen when clurgmgrd starts or stops a service ?
>
> Is there a specification covering the environment variables supplied to
> the resource scripts - eg. OCF_RESOURCE_INSTANCE ?

Useful info can be found at http://sources.redhat.com/cluster/wiki/RGManager
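
For a rough idea of the shape such a script takes, here is a minimal sketch of an OCF-style agent. It is not taken from the rgmanager sources: the resource name `myapp`, the `pidfile` parameter and the `sleep` stand-in daemon are all hypothetical. It assumes the general conventions of the OCF RA API draft, namely that the action is passed as the first argument, resource parameters arrive as `OCF_RESKEY_<name>` environment variables, and the exit code reports the result.

```shell
#!/bin/sh
# Sketch of an OCF-style resource script; "myapp" and "pidfile" are hypothetical.

OCF_SUCCESS=0
OCF_ERR_GENERIC=1
OCF_ERR_UNIMPLEMENTED=3
OCF_NOT_RUNNING=7

# Resource parameters arrive in the environment as OCF_RESKEY_<name>.
pidfile() {
    echo "${OCF_RESKEY_pidfile:-/var/run/myapp.pid}"
}

myapp_status() {
    # Running if the pidfile exists and names a live process.
    if [ -f "$(pidfile)" ] && kill -0 "$(cat "$(pidfile)")" 2>/dev/null; then
        return $OCF_SUCCESS
    fi
    return $OCF_NOT_RUNNING
}

myapp_start() {
    myapp_status && return $OCF_SUCCESS      # already running
    sleep 600 >/dev/null 2>&1 &              # stand-in for the real daemon
    echo $! > "$(pidfile)" || return $OCF_ERR_GENERIC
    return $OCF_SUCCESS
}

myapp_stop() {
    if myapp_status; then
        kill "$(cat "$(pidfile)")" || return $OCF_ERR_GENERIC
    fi
    rm -f "$(pidfile)"
    return $OCF_SUCCESS                      # stopping a stopped resource succeeds
}

myapp_metadata() {
    # Placeholder: a real agent describes its parameters and actions here.
    echo '<?xml version="1.0"?>'
    echo '<resource-agent name="myapp" version="0.1"/>'
}

agent_main() {
    case "${1:-}" in
        start)          myapp_start ;;
        stop)           myapp_stop ;;
        status|monitor) myapp_status ;;
        meta-data)      myapp_metadata ;;
        *)              return $OCF_ERR_UNIMPLEMENTED ;;
    esac
}

# Dispatch only when an action is given, so the file can also be sourced.
if [ "$#" -gt 0 ]; then
    agent_main "$@"
    exit $?
fi
```

The scripts shipped in /usr/share/cluster remain the authoritative examples, in particular for the metadata that a real agent has to print.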
>
> Are the actions of the various scripts documented or specified somewhere
> ?   Do they tend to change across releases ?
>
> Is there a standard way of extending the monitoring performed by the
> scripts, or do I just edit the supplied scripts to suit ?
>
> During experiments in configuring a service, the cluster often reached a
> state where clustat reports a service as "failed".  What is the best way
> of recovering from this state ?  I cannot see that clusvcadm can be used
> to recover from this state, and so far the only path to recovery appears
> to be to restart rgmanager on all cluster nodes.
>
From my experience, there is no need to restart rgmanager. Just disable
the failed service (clusvcadm -D myfailedservice), find out and fix what
caused the service to fail (in general, scripting errors), then restart the
service (clusvcadm -e myfailedservice).
> Thanks in advance for any pointers on this.
>
> -- Martin
>
>
> --
> Linux-cluster mailing list
> Linux-cluster@xxxxxxxxxx
> https://www.redhat.com/mailman/listinfo/linux-cluster
>
