On Wed, 2005-10-05 at 17:08 -0400, Lon Hohberger wrote:
> On Mon, 2005-10-03 at 11:23 -0400, Eric Kerin wrote:
> > On Sun, 2005-10-02 at 11:06 -0400, DeadManMoving wrote:
> > > My cluster is highly instable, just this morning i've realized that
> > > the clurgmgrd deamon was dead...
> >
> > I'm having this same problem on my cluster, I've been planning on
> > enabling core dumps for rgmanager once I find a few minutes to restart
> > the cluster services. With any luck, that will be today.
>
> If you see anything, let me know. There's a segfault I'm trying to
> track down which this is... I haven't been able to reproduce it
> internally :(
>

I finally got the downtime to enable core dumps, and just noticed that
rgmanager crashed (not hung in the segfault loop).

After looking at this a bit, this problem is becoming quite strange to
me. I don't have any nfs exports in my cluster.conf file, so I don't
think that bug applies. But I am seeing really strange data in the
backtraces (below).

Similar to https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=166109

The thing is, this is a stock RHEL4 U1 Kernel (2.6.9-11.ELsmp) on 64 bit
capable Xeon processors, but running on a 32 bit kernel.

I can compress the core dump I have and send it, if you like, or run any
commands with gdb (and the like) needed.

Thanks,
Eric

[root@auhjpsn01a ~]# gdb /usr/sbin/clurgmgrd
GNU gdb Red Hat Linux (6.3.0.0-0.31rh)
<SNIP LICENSE+STUFF>
This GDB was configured as "i386-redhat-linux-gnu"...Using host
libthread_db library "/lib/tls/libthread_db.so.1".

(gdb) core /core.2707
Core was generated by `clurgmgrd'.
Program terminated with signal 11, Segmentation fault.
#0  0x006bb5e9 in ?? ()
(gdb) thr a a bt

Thread 4 (process 2707):
#0  0x006427a2 in ?? ()
Cannot access memory at address 0xbff3dbcc

Thread 3 (process 3917):
#0  0x006427a2 in ?? ()
Cannot access memory at address 0xb75e4318

Thread 2 (process 10987):
#0  0x006427a2 in ?? ()
Cannot access memory at address 0xb4bff28c

Thread 1 (process 10986):
#0  0x006bb5e9 in ?? ()
#1  0x00000000 in ?? ()
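
The message says core dumps were enabled for rgmanager but does not say how. As a rough illustration only, the sketch below shows one generic way a launcher process on Linux could raise its core-file size limit before starting a daemon, using Python's standard `resource` module. The helper name `raise_core_limit` is hypothetical, and this is not taken from the rgmanager init script or from the setup used in this thread.

```python
import resource


def raise_core_limit():
    """Raise the soft core-file size limit to the current hard limit.

    Hypothetical helper for illustration. Returns the (soft, hard)
    pair in effect after the change.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    # A process may always raise its soft limit as far as its hard limit;
    # going beyond the hard limit would need extra privileges.
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)


if __name__ == "__main__":
    soft, hard = raise_core_limit()
    label = "unlimited" if soft == resource.RLIM_INFINITY else str(soft)
    print("core file size limit is now:", label)
```

Resource limits are inherited by child processes, so a change like this only helps if it is made in the process that goes on to start the daemon, which is presumably why the cluster services had to be restarted here.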