Not restarting "max_restarts" times before relocating failed service

Hi experts,

I have defined a service in cluster.conf as follows:

                <service autostart="0" domain="mydomain" exclusive="0" max_restarts="5" name="mgmt" recovery="restart">
                        <script ref="myHaAgent"/>
                        <ip ref="192.168.51.51"/>
                </service>
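
As a sanity check that the definition above really carries the settings I think it does, here is a minimal Python sketch that parses the fragment and prints the recovery-related attributes. The function name summarize_service is purely illustrative and is not part of rgmanager or any cluster tool; the XML is the fragment above with the indentation trimmed.

```python
import xml.etree.ElementTree as ET

# The <service> element as posted above (indentation trimmed).
SERVICE_XML = """
<service autostart="0" domain="mydomain" exclusive="0" max_restarts="5" name="mgmt" recovery="restart">
    <script ref="myHaAgent"/>
    <ip ref="192.168.51.51"/>
</service>
"""

def summarize_service(xml_text):
    """Hypothetical helper: return recovery settings and child resource refs."""
    service = ET.fromstring(xml_text.strip())
    raw_max = service.get("max_restarts")
    return {
        "name": service.get("name"),
        "recovery": service.get("recovery"),
        # None means the attribute is absent; no default is assumed here.
        "max_restarts": int(raw_max) if raw_max is not None else None,
        # Child resources in document order, as (tag, ref) pairs.
        "resources": [(child.tag, child.get("ref")) for child in service],
    }

summary = summarize_service(SERVICE_XML)
print(summary)
```

This only confirms what the file says (recovery="restart", max_restarts="5", one script and one ip resource); it says nothing about how rgmanager interprets those values.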

I set max_restarts="5", expecting that if the cluster fails to start the service five times, it will then relocate the service to another cluster node in the failover domain.

To test this, I brought down the NIC hosting the service's floating IP and got the following logs:

Oct 30 14:11:49 XXXX clurgmgrd: [10753]: <warning> Link for eth1: Not detected
Oct 30 14:11:49 XXXX clurgmgrd: [10753]: <warning> No link on eth1...
Oct 30 14:11:49 XXXX clurgmgrd: [10753]: <warning> No link on eth1...
Oct 30 14:11:49 XXXX clurgmgrd[10753]: <notice> status on ip "192.168.51.51" returned 1 (generic error)
Oct 30 14:11:49 XXXX clurgmgrd[10753]: <notice> Stopping service service:mgmt
Oct 30 14:12:00 XXXX clurgmgrd[10753]: <notice> Service service:mgmt is recovering
Oct 30 14:12:00 XXXX clurgmgrd[10753]: <notice> Recovering failed service service:mgmt
Oct 30 14:12:00 XXXX clurgmgrd[10753]: <notice> start on ip "192.168.51.51" returned 1 (generic error)
Oct 30 14:12:00 XXXX clurgmgrd[10753]: <warning> #68: Failed to start service:mgmt; return value: 1
Oct 30 14:12:00 XXXX clurgmgrd[10753]: <notice> Stopping service service:mgmt
Oct 30 14:12:00 XXXX clurgmgrd[10753]: <notice> Service service:mgmt is recovering
Oct 30 14:12:00 XXXX clurgmgrd[10753]: <warning> #71: Relocating failed service service:mgmt

Oct 30 14:12:01 XXXX clurgmgrd[10753]: <notice> Service service:mgmt is stopped
Oct 30 14:12:01 XXXX clurgmgrd[10753]: <notice> Service service:mgmt is stopped

But from the log, it appears that the cluster tried to restart the service only ONCE before relocating it.
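
To make the "only once" reading concrete, here is a small Python sketch that counts the recovery attempts logged for the service before the relocation message. The function name and the string matching are my own assumptions based on the message wording in the excerpt above, not an rgmanager interface; the log text is copied from that excerpt, starting at the failed status check.

```python
# Log lines copied from the excerpt above, from the failed status check onward.
LOG = """\
Oct 30 14:11:49 XXXX clurgmgrd[10753]: <notice> status on ip "192.168.51.51" returned 1 (generic error)
Oct 30 14:11:49 XXXX clurgmgrd[10753]: <notice> Stopping service service:mgmt
Oct 30 14:12:00 XXXX clurgmgrd[10753]: <notice> Service service:mgmt is recovering
Oct 30 14:12:00 XXXX clurgmgrd[10753]: <notice> Recovering failed service service:mgmt
Oct 30 14:12:00 XXXX clurgmgrd[10753]: <notice> start on ip "192.168.51.51" returned 1 (generic error)
Oct 30 14:12:00 XXXX clurgmgrd[10753]: <warning> #68: Failed to start service:mgmt; return value: 1
Oct 30 14:12:00 XXXX clurgmgrd[10753]: <notice> Stopping service service:mgmt
Oct 30 14:12:00 XXXX clurgmgrd[10753]: <notice> Service service:mgmt is recovering
Oct 30 14:12:00 XXXX clurgmgrd[10753]: <warning> #71: Relocating failed service service:mgmt
"""

def recovery_attempts_before_relocation(log_text, service):
    """Hypothetical helper: count 'Recovering failed service' lines seen
    before the first 'Relocating failed service' line for `service`.
    Returns None if no relocation message is found."""
    attempts = 0
    for line in log_text.splitlines():
        if f"Relocating failed service {service}" in line:
            return attempts
        if f"Recovering failed service {service}" in line:
            attempts += 1
    return None

attempts = recovery_attempts_before_relocation(LOG, "service:mgmt")
print(attempts)
```

Run against the excerpt, this reports a single local recovery attempt before the "#71: Relocating failed service" message, which is what prompted my question.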

I was expecting the cluster to retry starting this service five times on the same node before relocating it.

Can anybody correct my understanding?

Thanks,
Parvez
-- 
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
