On Sun, Jul 01, 2007 at 02:17:48PM +0300, Janne Peltonen wrote:
> Hi!
>
> Sometimes, when I have cleanly shut down rgmanager on one node, and the
> services have nicely migrated to other nodes, trying to start rgmanager
> fails. Trying to access /dev/misc/dlm_rgmanager results in "No such
> device". clurgmgrd concludes that locks are not working and exits.
> (See strace output attached.)

That's really strange - it's almost as if the DLM isn't responding to
the requests.

The open of /dev/misc/dlm_rgmanager is performed by libdlm; rgmanager
simply asks libdlm to open its lockspace. If I am not mistaken, the
earlier open of /dev/misc/dlm-control, followed by the write, is
basically the DLM saying "yeah, that device exists". However, the device
node isn't there, so when libdlm goes to open it, the open fails.

> Trying to stop cman fails:
>
> --clip--
> [jmmpelto@pcn1 ~]$ sudo service cman restart
> Stopping cluster:
>    Stopping fencing... done
>    Stopping cman... failed
> /usr/sbin/cman_tool: Error leaving cluster: Device or resource busy
>                                                            [FAILED]

If something happened to the DLM while rgmanager was trying to use it,
I suspect there's a chance that rgmanager could keep something held,
which would prevent cman from shutting down.

This sounds related to an open bugzilla in which rgmanager does not
clean up its lockspace in all cases. In a clean shutdown, rgmanager
should always clean up the lockspace.

-- Lon

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
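
For anyone wanting to check the symptom discussed above (the control
node present but the lockspace node missing), here is a minimal Python
sketch. It is not part of rgmanager or libdlm; the function name
check_dlm_nodes and its parameters are made up for illustration, and the
only paths it assumes are the two named in this thread. It only stats
the nodes, so it can tell you whether a node exists, not whether the
kernel has a working device behind it.

```python
import os
import stat


def check_dlm_nodes(lockspace="rgmanager", dev_dir="/dev/misc"):
    """Report the state of the DLM device nodes mentioned in this thread.

    Hypothetical helper: it assumes the control node is named
    "dlm-control" and the lockspace node "dlm_<lockspace>", as in the
    paths quoted above.
    """
    report = {}
    for name in ("dlm-control", "dlm_" + lockspace):
        path = os.path.join(dev_dir, name)
        try:
            mode = os.stat(path).st_mode
        except FileNotFoundError:
            # The node does not exist at all.
            report[name] = "missing"
        except OSError as exc:
            # Any other failure (permissions, I/O error, ...).
            report[name] = "error: " + os.strerror(exc.errno)
        else:
            if stat.S_ISCHR(mode):
                report[name] = "char device"
            else:
                report[name] = "present, not a char device"
    return report


if __name__ == "__main__":
    for name, state in check_dlm_nodes().items():
        print(f"{name}: {state}")
```

If dlm-control shows up as a char device while dlm_rgmanager is
missing, that would match the situation described in the reply above.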