Re: RHCS4 rgmanager/clurmgrd problem

Greg: Your explanation clarified for me what's happening and what needs
to be done. Thanks much.
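
For the archives, here is a rough sketch of the status-checking stop()
that Greg describes below (what I actually did was just force "stop" to
return 0, but this would be the cleaner fix). It assumes a stock Red
Hat-style init script that sources /etc/init.d/functions; the names and
paths are illustrative, not the exact edit I made:

    #!/bin/bash
    # Sketch of a cluster-friendly stop() for /etc/init.d/named:
    # only attempt the stop if named is actually running, and report
    # success (0) if it was already stopped.
    . /etc/init.d/functions

    stop() {
        if status named >/dev/null 2>&1; then
            # named is running: try to stop it and return the real result
            killproc named
            RETVAL=$?
        else
            # already stopped: success as far as rgmanager is concerned
            echo "named is already stopped"
            RETVAL=0
        fi
        return $RETVAL
    }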

On Thu, 2006-03-16 at 15:26 -0500, Greg Forte wrote:
> this has been covered previously, but in brief:
> 
> a) the cluster services try to stop a service before starting it when 
> you enable it
> b) it expects the "/etc/init.d/service stop" command to return 0, 
> indicating that there was no problem
> c) many of the stock service scripts return non-zero if you try to stop 
> them when they're not running
> 
> depending on your point of view, (c) is the "correct" behavior or not; 
> in the case of cluster services, it's obviously not.  For the purposes 
> of cluster services, the script should only return non-zero on the 
> 'stop' command if the service was, in fact, running, and the script 
> failed to stop it.  A better solution than blindly returning 0 would be 
> to check the script's 'status' command and only attempt the stop if the 
> service is actually running, then return non-zero if the stop fails and 
> 0 (success) if the stop succeeds or the service wasn't running in the 
> first place.  But that's a lot of work.  ;-)
> 
> -g
> 
> Philip R. Dana wrote:
> > I found a workaround. Like the gentleman with the mysql service problem
> > a while back, I edited /etc/init.d/named on both nodes such that named
> > stop returns 0, even though named is already stopped. I'm not smart
> > enough, yet, to figure out why that works, but it does.
> > 
> > On Thu, 2006-03-16 at 07:22 -0800, Philip R. Dana wrote:
> >> We have a two node active/passive cluster running bind as our master DNS
> >> server. Shared storage is iSCSI on a NetApp Filer. The OS is CentOS 4.2.
> >> Whenever the rgmanager service on the passive node is started/restarted,
> >> the service resource on the active node fails, in that named itself is
> >> shut down. The only way to recover, as near as I can tell, is to set
> >> autostart=0 in cluster.conf, reboot both nodes, then manually start the
> >> service on one of the nodes. Is this by design, or an "undocumented
> >> feature"?
> >> Any help will be greatly appreciated. TIA.

--

Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
