Is the slow output from df expected? Does it just take considerable time to read a gfs superblock? In my scenario, is it likely that the heavy lock load was caused by the combination of a df and a umount at the same time? Were the gfs recover events in the log prior to the kernel panic normal, or is it possible that I attempted the umount too quickly after mounting? Would read-only mounts decrease lock load and the likelihood of this occurring again?

Thanks for the help. I was just about to move this into production and now I'm a little apprehensive. I just want to make sure I'm taking the necessary precautions.

-----Original Message-----
From: linux-cluster-bounces@xxxxxxxxxx [mailto:linux-cluster-bounces@xxxxxxxxxx] On Behalf Of Patrick Caulfield
Sent: Wednesday, December 14, 2005 3:54 AM
To: linux clustering
Subject: Re: dlm caused a kernel panic

Jeff Dinisco wrote:
> I'm running FC4 (2.6.13-1.1532_FC4smp), dlm-1.0.0-3 and GFS-6.1.0-3. I
> have a 3 node cluster. The df command has always been very slow to
> return output on my gfs mounted filesystems. Series of events...
>
> 16:20:00 - node01 was out of the cluster, node02 and node03 were active
> with 2 gfs filesystems mounted
> 16:22:10 - after joining the cluster, both filesystems were successfully
> mounted
> 16:22:37 - a df command was attempted by a monitoring script
> 16:22:54 - I executed /etc/init.d/gfs stop and it failed because 1 of
> the filesystems was busy and could not be umounted (the above df command
> may have been the cause, it ended up hanging)
>
> 16:22:55 - node02 and node03 panicked and were not properly fenced

If there was only one node left in the cluster it would not fence the other two because it doesn't have quorum, so it can't be sure that it hasn't simply been cut off from the other two nodes, which might still be working fine.

> Dec 13 16:22:56 node02 kernel: ------------[ cut here ]------------
> Dec 13 16:22:56 node02 kernel: kernel BUG at
> /usr/src/build/627959-i686/BUILD/smp/src/lockqueue.c:1007!

I can reproduce this under very heavy lock load, but I'm not sure what's causing it as yet. The "flood" tool I checked in to STABLE yesterday is almost guaranteed to cause it.

--
patrick
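On the df question: as I understand it, the statfs() call that df makes has to walk GFS resource groups under cluster locks to total up block usage, so the delay is usually in the filesystem rather than in df itself. A minimal sketch for timing that syscall directly (the mount point is a placeholder, not a path from this thread):

```python
#!/usr/bin/env python
# Minimal sketch: time the statvfs() call that df relies on, to see how
# long the filesystem itself takes to answer. The mount point below is a
# placeholder -- substitute your own GFS mount.
import os
import time

MOUNT_POINT = "/mnt/gfs1"   # placeholder path, not from the original thread

start = time.time()
st = os.statvfs(MOUNT_POINT)           # same information df reports
elapsed = time.time() - start

total_bytes = st.f_blocks * st.f_frsize
free_bytes = st.f_bfree * st.f_frsize
print("statvfs on %s took %.2f seconds" % (MOUNT_POINT, elapsed))
print("size: %d bytes, free: %d bytes" % (total_bytes, free_bytes))
```

If this call alone takes several seconds, the slowness is in answering statfs under lock, and running it concurrently with a umount would indeed add to the lock traffic.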
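On the fencing point: cman-style quorum is a simple majority of the expected votes, which is why a single surviving node in a three-node cluster cannot fence the other two. A rough illustration of that arithmetic, assuming the default one vote per node (the vote counts are assumptions, not values from this cluster's configuration):

```python
# Rough illustration of majority quorum, assuming one vote per node
# (the usual default); not a reimplementation of cman itself.

def has_quorum(votes_present, expected_votes):
    """Quorum requires a strict majority of the expected votes."""
    needed = expected_votes // 2 + 1
    return votes_present >= needed

expected = 3   # three-node cluster, one vote each (assumed)

# A lone surviving node holds 1 of 3 votes: no quorum, so no fencing.
print(has_quorum(1, expected))   # False
# Two nodes up hold 2 of 3 votes: quorate.
print(has_quorum(2, expected))   # True
```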