Re: restart or relocate?

Robert Peterson <rpeterso@xxxxxxxxxx> · Wed, 29 Nov 2006 12:39:35 -0600

Carlo Mandelli wrote:
Hi all,

I'm trying to test a two-node cluster (RHCS U4) with Apache and one
monitored IP on eth1 (VIP 192.168.0.3); the heartbeat is on eth0.

When I unplug the cable (eth1) on the active node, I get these errors:

Nov 29 17:03:54 node1 clurgmgrd: [4368]: <info> Executing /etc/init.d/httpd status
Nov 29 17:04:24 node1 clurgmgrd: [4368]: <info> Executing /etc/init.d/httpd status
Nov 29 17:04:25 node1 kernel: tg3: eth1: Link is down.
Nov 29 17:04:44 node1 clurgmgrd: [4368]: <warning> Link for eth1: Not detected
Nov 29 17:04:44 node1 clurgmgrd: [4368]: <warning> No link on eth1...
Nov 29 17:04:44 node1 clurgmgrd[4368]: <notice> status on ip "192.168.0.3" returned 1 (generic error)
Nov 29 17:04:44 node1 clurgmgrd[4368]: <notice> Stopping service http
Nov 29 17:04:44 node1 clurgmgrd: [4368]: <info> Executing /etc/init.d/httpd stop
Nov 29 17:04:44 node1 httpd: httpd shutdown succeeded
Nov 29 17:04:44 node1 clurgmgrd: [4368]: <info> Removing IPv4 address 192.168.0.3 from eth1
Nov 29 17:04:54 node1 clurgmgrd[4368]: <notice> Service http is recovering
Nov 29 17:04:54 node1 clurgmgrd[4368]: <notice> Recovering failed service http
Nov 29 17:04:54 node1 clurgmgrd: [4368]: <warning> Link for eth1: Not detected
Nov 29 17:04:54 node1 clurgmgrd: [4368]: <info> Executing /etc/init.d/httpd start
Nov 29 17:04:54 node1 httpd: httpd startup succeeded
Nov 29 17:04:54 node1 clurgmgrd[4368]: <notice> Service http started
Nov 29 17:05:04 node1 clurgmgrd: [4368]: <warning> 192.168.0.3 is not configured
Nov 29 17:05:04 node1 clurgmgrd[4368]: <notice> status on ip "192.168.0.3" returned 1 (generic error)
Nov 29 17:05:04 node1 clurgmgrd[4368]: <notice> Stopping service http
Nov 29 17:05:04 node1 clurgmgrd: [4368]: <info> Executing /etc/init.d/httpd stop
Nov 29 17:05:04 node1 httpd: httpd shutdown succeeded
Nov 29 17:05:04 node1 clurgmgrd[4368]: <notice> Service http is recovering
Nov 29 17:05:04 node1 clurgmgrd[4368]: <notice> Recovering failed service http
Nov 29 17:05:04 node1 clurgmgrd: [4368]: <warning> Link for eth1: Not detected
Nov 29 17:05:04 node1 clurgmgrd: [4368]: <info> Executing /etc/init.d/httpd start
Nov 29 17:05:04 node1 httpd: httpd startup succeeded
<...>

and it restarts the service continuously.

It fails over only if I change the recovery mode in cluster.conf:

<service autostart="1" name="http" recovery="relocate">

Is there any way to set a maximum number of restart attempts before the
service is relocated?

Thanks
Carlo

Hi Carlo,

You're probably the victim of the init-script-not-returning-zero issue.
See:
http://sources.redhat.com/cluster/faq.html#rgm_wontrestart
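
For illustration, here is a minimal sketch of one common form of that
issue: a "stop" action that should exit 0 even when the service is
already stopped, which is what the LSB init script conventions call for.
This is not the stock httpd init script; the PIDFILE path and the
stop_service function name are hypothetical and used only for the
example.

```shell
#!/bin/sh
# Hypothetical pidfile location, for illustration only.
PIDFILE="${PIDFILE:-/tmp/example-daemon.pid}"

# Hypothetical helper: stop the daemon named by $PIDFILE.
# Returns 0 when the service ends up stopped, including when it was
# never running in the first place.
stop_service() {
    # No pidfile: nothing to stop, so report success.
    if [ ! -f "$PIDFILE" ]; then
        return 0
    fi

    pid=$(cat "$PIDFILE")

    # Stale pidfile: the process is gone, so clean up and report success.
    if ! kill -0 "$pid" 2>/dev/null; then
        rm -f "$PIDFILE"
        return 0
    fi

    # Process is running: signal it; the return status reflects whether
    # the signal was sent and the pidfile was removed.
    kill "$pid" && rm -f "$PIDFILE"
}
```

An init script built this way would call stop_service from its "stop"
branch and exit with that return value, so stopping an already-stopped
service is reported as success rather than as a failure.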

Regards,

Bob Peterson
Red Hat Cluster Suite

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster