>>>>We use a two-node cluster to manage failover services via dedicated scripts.
>>>>When using clusvcadm -r <service_name> to migrate a service from one node
>>>>to the other, CS4 occasionally gets stuck with a
>>>>"service_name stopping" diagnostic.

>>Could you let us know:
>>
>>- architecture

Two nodes connected to the backbone through eth0, and directly to each
other through eth1. The hostname is set on eth0, which is also used as the
fencing interface. Heartbeat is also configured on eth0.

>>- dlm-kernel package version

dlm-kernel.2.6.9-37.7.b.3

>>- rgmanager version

rgmanager.1.9.38-0.b.5

>>- service XML structure

What do you mean? The cluster.conf file?

>>- if possible, the service script itself (though this is the least
>>likely problem)

>>If you can, install the corresponding -debuginfo packages so we can get
>>a backtrace of the rgmanager daemon.

Will do that. At present the deadlock does not occur systematically, but it
is frequent. It may take us a while to reproduce the problem with the debug
packages installed.

>>>>The stop target of the script associated with the service is not called.
>>>>Subsequent clusvcadm -d <service_name> calls return a success diagnostic
>>>>but do effectively nothing: the service script is not called.

>>There's a segfault (which is fixed in RHCS4U3 beta and CVS) which might
>>explain the behavior.
>>-- Lon

--
Alain Moullé
Bull SA, 1, Rue de Provence, B.P. 208, 38432 Echirolles - CEDEX, France
Adr: FREC B1-041 | BCOM: 229 7599
Phone (from France): 04 76 29 75 99 | Fax: 04 76 29 72 49
Email: Alain.Moulle@xxxxxxxx

--
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
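
For anyone hitting the same "stopping" hang, here is a rough shell sketch of how the details requested in this thread (package versions and a backtrace of the rgmanager daemon) might be collected. It is not taken from the thread and rests on several assumptions: that the daemon process is named clurgmgrd, that the installed gdb accepts -batch, -x and -p, and that a status line has the service name somewhere before the word "stopping". The function names are hypothetical.

```shell
#!/bin/sh
# Sketch only. Assumption: the rgmanager daemon process is named clurgmgrd;
# override DAEMON in the environment if it differs on your system.
DAEMON="${DAEMON:-clurgmgrd}"

# Print the installed versions of the two packages asked about in the thread.
report_versions() {
    rpm -q dlm-kernel rgmanager
}

# Return 0 if a status line reports the given service as "stopping".
# $1 = service name, $2 = one line of status text (line format is assumed).
is_stopping() {
    case "$2" in
        *"$1"*stopping*) return 0 ;;
        *) return 1 ;;
    esac
}

# Attach gdb to every process named $DAEMON and dump all thread backtraces.
# Needs root and, for readable output, the matching -debuginfo packages.
dump_backtrace() {
    pids=$(pidof "$DAEMON") || { echo "$DAEMON is not running" >&2; return 1; }
    cmds=$(mktemp) || return 1
    echo "thread apply all bt" > "$cmds"
    for pid in $pids; do
        echo "=== backtrace of $DAEMON, pid $pid ==="
        gdb -batch -x "$cmds" -p "$pid"
    done
    rm -f "$cmds"
}
```

Sourcing the file only defines the functions. A typical sequence would be to run report_versions once, then call dump_backtrace as root while the service is stuck and attach the output to a reply.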