Hi Robert,

We have also run into a lot of these problems.

First of all, I highly recommend installing a newer rgmanager; your version really has a lot of bugs. The minimum is rgmanager-1.9.54-4.222484hf, but 4.5 has now been released, so take that one.

We also have the dlm problems (BZ#206463 and BZ#199673), and Support recommended RHEL4 U5 to us, which should fix them.

--> Install RHEL4 Update 5 and all your problems should be fixed.

Regards,
Mike

-----Original Message-----
From: linux-cluster-bounces@xxxxxxxxxx on behalf of Robert Hurst
Sent: Mon 14.05.2007 21:46
To: linux clustering
Subject: RE: clurgmgrd - <err> #48: Unable to obtain cluster lock: Connection timed out

Any new thoughts on this? Is it a new bug, and is it fixed with U5? I
have a ticket open, but your insights on how likely this is to be a
recurring bug would be helpful. Thanks.

On Fri, 2007-05-11 at 19:54 -0400, rhurst@xxxxxxxxxxxxxxxxx wrote:
> We are using RHEL 4 U4 with the GFS/CS that works for that release:
>
> $ rpm -q rgmanager dlm dlm-kernel magma magma-plugins
>
> rgmanager-1.9.54-1
> dlm-1.0.1-1
> dlm-kernel-2.6.9-44.9
> magma-1.0.6-0
> magma-plugins-1.0.9-0
>
> Would the just-announced GFS/CS for U5 help any? Looks like a lot of
> issues were addressed.
>
> Robert Hurst, Sr. Caché Administrator
> Beth Israel Deaconess Medical Center
> 1135 Tremont Street, REN-7
> Boston, Massachusetts 02120-2140
> 617-754-8754 · Fax: 617-754-8730 · Cell: 401-787-3154
> Any technology distinguishable from magic is insufficiently advanced.
>
> ______________________________________________________________________
> From: linux-cluster-bounces@xxxxxxxxxx on behalf of Lon Hohberger
> Sent: Fri 5/11/2007 4:19 PM
> To: linux clustering
> Subject: Re: clurgmgrd - <err> #48: Unable to obtain cluster lock:
> Connection timed out
>
> On Mon, May 07, 2007 at 01:54:56PM -0400, rhurst@xxxxxxxxxxxxxxxxx wrote:
> > What could cause clurgmgrd to fail like this? If clurgmgrd has a
> > hiccup like this, is it supposed to shut down its services? Is there
> > something in our implementation that could have prevented this from
> > shutting down?
> >
> > For unexplained reasons, we just had our CS service (WATSON) go down
> > on its own, and the syslog entry details the event as:
> >
> > May 7 13:18:39 db1 clurgmgrd[17888]: <err> #48: Unable to obtain cluster lock: Connection timed out
> > May 7 13:18:41 db1 kernel: dlm: Magma: reply from 2 no lock
> > May 7 13:18:41 db1 kernel: dlm: reply
> > May 7 13:18:41 db1 kernel: rh_cmd 5
> > May 7 13:18:41 db1 kernel: rh_lkid 200242
> > May 7 13:18:41 db1 kernel: lockstate 2
> > May 7 13:18:41 db1 kernel: nodeid 0
> > May 7 13:18:41 db1 kernel: status 0
> > May 7 13:18:41 db1 kernel: lkid ee0388
> > May 7 13:18:41 db1 clurgmgrd[17888]: <notice> Stopping service WATSON
>
> This usually is a dlm bug. Once the DLM gets into this state,
> rgmanager blows up. What rgmanager are you using?
>
> (There's only one lock per service; the complexity of the service
> doesn't matter...)
>
> --
> Lon Hohberger - Software Engineer - Red Hat, Inc.
>
> --
> Linux-cluster mailing list
> Linux-cluster@xxxxxxxxxx
> https://www.redhat.com/mailman/listinfo/linux-cluster
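
For anyone who wants to check whether their own nodes have hit the same clurgmgrd error, the syslog excerpt quoted above can be searched for mechanically. What follows is only a minimal sketch and not a tool from the thread: the function name find_lock_timeouts and the regular expression are hypothetical, and the pattern assumes the line layout shown in the quoted log (timestamp, host, clurgmgrd[pid], then "<err> #48:").

```python
import re

# Hypothetical pattern, written against the syslog layout quoted in this
# thread, e.g.:
#   May 7 13:18:39 db1 clurgmgrd[17888]: <err> #48: Unable to obtain cluster lock: ...
LOCK_ERR = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+"
    r"(?P<host>\S+)\s+clurgmgrd\[(?P<pid>\d+)\]:\s+"
    r"<err>\s+#48:\s+(?P<msg>.*)$"
)

# Sample lines copied from the log excerpt in the quoted message.
SAMPLE = [
    "May 7 13:18:39 db1 clurgmgrd[17888]: <err> #48: Unable to obtain cluster lock: Connection timed out",
    "May 7 13:18:41 db1 kernel: dlm: Magma: reply from 2 no lock",
    "May 7 13:18:41 db1 clurgmgrd[17888]: <notice> Stopping service WATSON",
]


def find_lock_timeouts(lines):
    """Return one dict (stamp, host, pid, msg) per clurgmgrd error #48 line."""
    hits = []
    for line in lines:
        match = LOCK_ERR.match(line.strip())
        if match:
            hits.append(match.groupdict())
    return hits


for hit in find_lock_timeouts(SAMPLE):
    print(hit["stamp"], hit["host"], hit["msg"])
```

To run it against a real log, pass an open file object instead of SAMPLE. The log path depends on the local syslog configuration, so treat any path (for example /var/log/messages) as an assumption to verify on your own nodes.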
-- Linux-cluster mailing list Linux-cluster@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/linux-cluster