Re: clurgmgrd - <err> #48: Unable to obtain cluster lock: Connection timed out

On Mon, May 07, 2007 at 01:54:56PM -0400, rhurst@xxxxxxxxxxxxxxxxx wrote:
> What could cause clurgmgrd to fail like this?  If clurgmgrd has a hiccup
> like this, is it supposed to shut down its services?  Is there something
> in our implementation that could have prevented this from shutting down?
> 
> For unexplained reasons, we just had our CS service (WATSON) go down on
> its own, and the syslog entry details the event as:
> 
> May  7 13:18:39 db1 clurgmgrd[17888]: <err> #48: Unable to obtain
> cluster lock: Connection timed out 
> May  7 13:18:41 db1 kernel: dlm: Magma: reply from 2 no lock
> May  7 13:18:41 db1 kernel: dlm: reply
> May  7 13:18:41 db1 kernel: rh_cmd 5
> May  7 13:18:41 db1 kernel: rh_lkid 200242
> May  7 13:18:41 db1 kernel: lockstate 2
> May  7 13:18:41 db1 kernel: nodeid 0
> May  7 13:18:41 db1 kernel: status 0
> May  7 13:18:41 db1 kernel: lkid ee0388
> May  7 13:18:41 db1 clurgmgrd[17888]: <notice> Stopping service WATSON 

This is usually a DLM bug.  Once the DLM gets into this state,
rgmanager blows up.  Which version of rgmanager are you using?

(There's only one lock per service; the complexity of the service
doesn't matter...)
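
To check whether this has happened before, something like the following
could be run over old syslog files.  It is only a rough sketch: the
function name find_lock_failures, the regular expression, and the idea
of pairing each "#48" error with the next "Stopping service" line are
assumptions based on the log excerpt quoted above, not anything
rgmanager itself provides.

```python
import re

# Assumed line layout, taken from the excerpt in this thread, e.g.
# "May  7 13:18:39 db1 clurgmgrd[17888]: <err> #48: Unable to obtain"
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+"
    r"clurgmgrd\[(?P<pid>\d+)\]:\s+<(?P<level>\w+)>\s+(?P<msg>.*)$"
)

STOP_PREFIX = "Stopping service "


def find_lock_failures(lines):
    """Collect each '#48' lock error and the services stopped after it.

    Hypothetical helper: returns a list of dicts with the keys
    'ts', 'host' and 'stopped' (a list of service names).
    """
    events = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            # Kernel/dlm lines and wrapped continuation lines are skipped.
            continue
        msg = m.group("msg")
        if m.group("level") == "err" and msg.startswith("#48:"):
            events.append(
                {"ts": m.group("ts"), "host": m.group("host"), "stopped": []}
            )
        elif events and msg.startswith(STOP_PREFIX):
            # Attribute the stop to the most recent lock error seen.
            events[-1]["stopped"].append(msg[len(STOP_PREFIX):].strip())
    return events


# The log lines from the original report, without the "> " quoting.
SAMPLE = [
    "May  7 13:18:39 db1 clurgmgrd[17888]: <err> #48: Unable to obtain",
    "cluster lock: Connection timed out ",
    "May  7 13:18:41 db1 kernel: dlm: Magma: reply from 2 no lock",
    "May  7 13:18:41 db1 clurgmgrd[17888]: <notice> Stopping service WATSON ",
]

if __name__ == "__main__":
    for event in find_lock_failures(SAMPLE):
        print(event["ts"], event["host"], event["stopped"])
```

To scan a real file, pass an open file object instead of SAMPLE, for
example find_lock_failures(open(path)) with path pointing at the
syslog file on that node.  A "Stopping service" line is attributed to
the most recent "#48" error, so a stop that happens for an unrelated
reason later in the same file would be miscounted.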

-- 
Lon Hohberger - Software Engineer - Red Hat, Inc.

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
