On Monday 26 March 2007 18:05:30 Lon Hohberger wrote:
> On Thu, 2007-03-22 at 09:17 +0100, Marc Grimme wrote:
> > Hello,
> > again we had the same problem as stated in January. We installed the
> > hotfix but it didn't help.
> > Again the whole cluster froze, and no node was allowed to rejoin the
> > fence domain.
> > Any ideas or do you need any more information?
> > Thanks and Regards Marc.
>
> This could be a couple of things, but I am certain it's not the same
> problem you had in January (though the symptoms are similar).
>
> Can you attach logs from the separate machines (instead of a single log
> with all the messages interleaved)? It would really help clear things
> up.
>
> -- Lon
>
>
> --
> Linux-cluster mailing list
> Linux-cluster@xxxxxxxxxx
> https://www.redhat.com/mailman/listinfo/linux-cluster

Find them attached. Note that they are from a crash yesterday, 26.3. at
04:00:00, with the same symptoms.
Do you have any explanation of how rgmanager can possibly freeze a whole
cluster? Isn't that a DLM bug?

--
Gruss / Regards,

Marc Grimme
Phone: +49-89 452 3538-14
http://www.atix.de/
http://www.open-sharedroot.org/

**
ATIX - Ges. fuer Informationstechnologie und Consulting mbH
Einsteinstr. 10 - 85716 Unterschleissheim - Germany

Registergericht: Amtsgericht München
Registernummer: HRB 131682
USt.-Id.: DE209485962

Geschäftsführung:
Marc Grimme, Mark Hlawatschek, Thomas Merz
Mar 26 04:00:06 lilr623a clurgmgrd: [20878]: <info> Executing /etc/init.d/ldap status
Mar 26 04:00:26 lilr623a clurgmgrd: [20878]: <info> Executing /usr/local/swadmin/caa/SAP/P06DB status
Mar 26 04:00:36 lilr623a clurgmgrd: [20878]: <info> Executing /etc/init.d/ldap status
Mar 26 04:00:56 lilr623a clurgmgrd: [20878]: <info> Executing /usr/local/swadmin/caa/SAP/P06DB status
Mar 26 04:01:06 lilr623a clurgmgrd: [20878]: <info> Executing /etc/init.d/ldap status
Mar 26 04:01:26 lilr623a clurgmgrd: [20878]: <info> Executing /usr/local/swadmin/caa/SAP/P06DB status
Mar 26 04:01:36 lilr623a clurgmgrd: [20878]: <info> Executing /etc/init.d/ldap status
Mar 26 04:01:56 lilr623a clurgmgrd: [20878]: <info> Executing /usr/local/swadmin/caa/SAP/P06DB status
Mar 26 04:02:06 lilr623a clurgmgrd: [20878]: <info> Executing /etc/init.d/ldap status
Mar 26 04:00:03 lilr623b clurgmgrd: [21882]: <info> Executing /usr/local/swadmin/caa/SAP/P06CI status
Mar 26 04:00:13 lilr623b clurgmgrd: [21882]: <info> Executing /usr/local/swadmin/caa/SAP/P06SCS status
Mar 26 04:00:33 lilr623b clurgmgrd: [21882]: <info> Executing /usr/local/swadmin/caa/SAP/P06CI status
Mar 26 04:00:43 lilr623b clurgmgrd: [21882]: <info> Executing /usr/local/swadmin/caa/SAP/P06SCS status
Mar 26 04:01:03 lilr623b clurgmgrd: [21882]: <info> Executing /usr/local/swadmin/caa/SAP/P06CI status
Mar 26 04:01:13 lilr623b clurgmgrd: [21882]: <info> Executing /usr/local/swadmin/caa/SAP/P06SCS status
Mar 26 04:01:33 lilr623b clurgmgrd: [21882]: <info> Executing /usr/local/swadmin/caa/SAP/P06CI status
Mar 26 04:01:43 lilr623b clurgmgrd: [21882]: <info> Executing /usr/local/swadmin/caa/SAP/P06SCS status
Mar 26 04:02:03 lilr623b clurgmgrd: [21882]: <info> Executing /usr/local/swadmin/caa/SAP/P06CI status
Mar 26 04:02:13 lilr623b clurgmgrd: [21882]: <info> Executing /usr/local/swadmin/caa/SAP/P06SCS status
Mar 26 04:03:42 lilr623b clurgmgrd[21882]: <err> #49: Failed getting status for RG P06CI
Mar 26 04:05:12 lilr623b clurgmgrd[21882]: <err> #51: Failed getting status for RG P06CI
Mar 26 04:06:42 lilr623b clurgmgrd[21882]: <err> #49: Failed getting status for RG P06SCS
Mar 26 04:08:57 lilr623b clurgmgrd[21882]: <err> #51: Failed getting status for RG P06SCS
Mar 26 04:09:42 lilr623b clurgmgrd[21882]: <err> #49: Failed getting status for RG P06CI
Mar 26 04:11:12 lilr623b clurgmgrd[21882]: <err> #49: Failed getting status for RG P06SCS
Mar 26 04:13:27 lilr623b clurgmgrd[21882]: <err> #51: Failed getting status for RG P06CI
Mar 26 04:14:12 lilr623b clurgmgrd[21882]: <err> #51: Failed getting status for RG P06SCS
Mar 26 04:15:42 lilr623b clurgmgrd[21882]: <err> #49: Failed getting status for RG P06CI
Mar 26 04:17:12 lilr623b clurgmgrd[21882]: <err> #49: Failed getting status for RG P06SCS
Mar 26 04:19:27 lilr623b clurgmgrd[21882]: <err> #51: Failed getting status for RG P06CI
Mar 26 04:00:07 lilr623e last message repeated 3 times
Mar 26 04:01:37 lilr623e last message repeated 3 times
Mar 26 04:02:07 lilr623e clurgmgrd: [21292]: <info> Executing /usr/local/swadmin/caa/SAP/P06TREX01 status
Mar 26 04:00:12 lilr623f clurgmgrd: [21331]: <info> Executing /usr/local/swadmin/caa/SAP/P06WD002 status
Mar 26 04:00:12 lilr623f clurgmgrd: [21331]: <info> Executing /usr/local/swadmin/caa/SAP/P06TREX02 status
Mar 26 04:00:42 lilr623f clurgmgrd: [21331]: <info> Executing /usr/local/swadmin/caa/SAP/P06WD002 status
Mar 26 04:00:42 lilr623f clurgmgrd: [21331]: <info> Executing /usr/local/swadmin/caa/SAP/P06TREX02 status
Mar 26 04:01:12 lilr623f clurgmgrd: [21331]: <info> Executing /usr/local/swadmin/caa/SAP/P06WD002 status
Mar 26 04:01:12 lilr623f clurgmgrd: [21331]: <info> Executing /usr/local/swadmin/caa/SAP/P06TREX02 status
Mar 26 04:01:42 lilr623f clurgmgrd: [21331]: <info> Executing /usr/local/swadmin/caa/SAP/P06WD002 status
Mar 26 04:01:42 lilr623f clurgmgrd: [21331]: <info> Executing /usr/local/swadmin/caa/SAP/P06TREX02 status
Mar 26 04:02:12 lilr623f clurgmgrd: [21331]: <info> Executing /usr/local/swadmin/caa/SAP/P06WD002 status
Mar 26 04:02:12 lilr623f clurgmgrd: [21331]: <info> Executing /usr/local/swadmin/caa/SAP/P06TREX02 status
Mar 26 04:03:42 lilr623f clurgmgrd[21331]: <err> #49: Failed getting status for RG P06WD002
Mar 26 04:05:12 lilr623f clurgmgrd[21331]: <err> #51: Failed getting status for RG P06WD002
Mar 26 04:06:42 lilr623f clurgmgrd[21331]: <err> #49: Failed getting status for RG P06WD002
Mar 26 04:00:54 lilr623c last message repeated 3 times
Mar 26 04:01:54 lilr623c last message repeated 2 times
Mar 26 04:00:14 lilr623d clurgmgrd: [20322]: <info> Executing /usr/local/swadmin/caa/SAP/P06AP002 status
Mar 26 04:00:24 lilr623d clurgmgrd: [20322]: <info> Executing /usr/local/swadmin/caa/SAP/P06WD001 status
Mar 26 04:00:44 lilr623d clurgmgrd: [20322]: <info> Executing /usr/local/swadmin/caa/SAP/P06AP002 status
Mar 26 04:00:54 lilr623d clurgmgrd: [20322]: <info> Executing /usr/local/swadmin/caa/SAP/P06WD001 status
Mar 26 04:01:14 lilr623d clurgmgrd: [20322]: <info> Executing /usr/local/swadmin/caa/SAP/P06AP002 status
Mar 26 04:01:24 lilr623d clurgmgrd: [20322]: <info> Executing /usr/local/swadmin/caa/SAP/P06WD001 status
Mar 26 04:01:44 lilr623d clurgmgrd: [20322]: <info> Executing /usr/local/swadmin/caa/SAP/P06AP002 status
Mar 26 04:01:54 lilr623d clurgmgrd: [20322]: <info> Executing /usr/local/swadmin/caa/SAP/P06WD001 status
Mar 26 04:02:14 lilr623d clurgmgrd: [20322]: <info> Executing /usr/local/swadmin/caa/SAP/P06AP002 status
Mar 26 04:03:45 lilr623d clurgmgrd[20322]: <err> #49: Failed getting status for RG P06WD001
Mar 26 04:06:00 lilr623d clurgmgrd[20322]: <err> #51: Failed getting status for RG P06WD001
Mar 26 04:06:45 lilr623d clurgmgrd[20322]: <err> #49: Failed getting status for RG P06AP002
Mar 26 04:08:15 lilr623d clurgmgrd[20322]: <err> #49: Failed getting status for RG P06WD001
Mar 26 04:09:45 lilr623d clurgmgrd[20322]: <err> #51: Failed getting status for RG P06AP002
Mar 26 04:11:15 lilr623d clurgmgrd[20322]: <err> #51: Failed getting status for RG P06WD001
Mar 26 04:12:45 lilr623d clurgmgrd[20322]: <err> #49: Failed getting status for RG P06AP002
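
To compare the per-node excerpts side by side, it may help to reduce each one to two timestamps: the last logged status check and the first logged error. The following is only a rough sketch, not an rgmanager tool; the names LINE_RE and summarize are made up for illustration, and the regular expression assumes the classic syslog layout seen in the excerpts ("Mon DD HH:MM:SS host message").

```python
import re
from collections import defaultdict

# Assumed layout (taken from the excerpts): "Mar 26 04:03:42 host message..."
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+(?P<host>\S+)\s+(?P<msg>.*)$"
)


def summarize(lines):
    """Map each host to its last '<info> Executing' and first '<err>' timestamp.

    Lines are assumed to be in chronological order, as in a syslog file.
    Hosts that only log other messages still appear, with both values None.
    """
    summary = defaultdict(lambda: {"last_ok": None, "first_err": None})
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip blank or unparseable lines
        entry = summary[m["host"]]
        if "<info> Executing" in m["msg"]:
            entry["last_ok"] = m["stamp"]  # keep overwriting: last one wins
        elif "<err>" in m["msg"] and entry["first_err"] is None:
            entry["first_err"] = m["stamp"]  # only the first error is kept
    return dict(summary)


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file names): python summarize.py node-a.log node-b.log
    for path in sys.argv[1:]:
        with open(path) as fh:
            for host, entry in summarize(fh).items():
                print(host, entry["last_ok"], entry["first_err"])
```

Run over the excerpts above, this kind of summary makes it easier to see whether the status checks stop at about the same time on every node and how soon afterwards the first "Failed getting status" error is logged.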