Looks like this bug:

GFS2 - probably lost glock call back
https://bugzilla.redhat.com/show_bug.cgi?id=498976

This is fixed in the kernel included in RHEL 5.5. Do a "yum update" to fix it.

Ricardo Arguello

On Tue, Mar 2, 2010 at 6:10 AM, Emilio Arjona <emilio.ah@xxxxxxxxx> wrote:
> Thanks for your response, Steve.
>
> 2010/3/2 Steven Whitehouse <swhiteho@xxxxxxxxxx>:
>> Hi,
>>
>> On Fri, 2010-02-26 at 16:52 +0100, Emilio Arjona wrote:
>>> Hi,
>>>
>>> we are experiencing some problems discussed in an old thread:
>>>
>>> http://www.mail-archive.com/linux-cluster@xxxxxxxxxx/msg07091.html
>>>
>>> We have 3 clustered servers under Red Hat 5.4 accessing a GFS2 resource.
>>>
>>> fstab options:
>>> /dev/vg_cluster/lv_cluster /opt/datacluster gfs2
>>> defaults,noatime,nodiratime,noquota 0 0
>>>
>>> GFS options:
>>> plock_rate_limit="0"
>>> plock_ownership=1
>>>
>>> httpd processes run into D status sometimes and the only solution is
>>> to hard reset the affected server.
>>>
>>> Can anyone give me some hints to diagnose the problem?
>>>
>>> Thanks :)
>>>
>> Can you give me a rough idea of what the actual workload is and how it
>> is distributed among the director(y/ies)?
>
> We had problems with php sessions in the past but we fixed it by
> configuring php to store the sessions in the database instead of in
> the GFS filesystem. Now, we're having problems with files and
> directories in the "data" folder of Moodle LMS.
>
> "lsof -p" returned an I/O operation over the same folder in 2/3 nodes,
> we did a hard reset of these nodes but some hours after the CPU load
> grew up again, especially in the node that wasn't rebooted. We decided
> to reboot (via ssh) this node, then the CPU load went down to normal
> values in all nodes.
>
> I don't think the system's load is high enough to produce concurrent
> access problems.
> It's more likely to be some misconfiguration; in fact, we changed some
> GFS2 options to non-default values to increase performance
> (http://www.linuxdynasty.org/howto-increase-gfs2-performance-in-a-cluster.html).
>
>>
>> This is often down to contention on glocks (one per inode) and maybe
>> because there is a process or processes writing a file or directory
>> which is in use (either read-only or writable) by other processes.
>>
>> If you are using php, then you might have to strace it to find out what
>> it is really doing,
>
> Ok, we will try to strace the D processes and post the results. Hope
> we find something!!
>
>>
>> Steve.
>>
>>> --
>>>
>>> Emilio Arjona.
>>>
>>> --
>>> Linux-cluster mailing list
>>> Linux-cluster@xxxxxxxxxx
>>> https://www.redhat.com/mailman/listinfo/linux-cluster
>
> --
> Emilio Arjona.
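
The thread talks about picking out the httpd processes that are stuck in D
status before looking at them more closely. As a rough sketch only (not
something posted in the thread), the script below reads the process state
from /proc/<pid>/stat and lists the tasks currently in state D. The helper
names parse_stat and find_d_state are made up for this illustration, and the
"httpd" filter is just the process name mentioned above.

```python
import os


def parse_stat(text):
    """Split one /proc/<pid>/stat record into (pid, comm, state)."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so use the first '(' and the last ')' as delimiters.
    left = text.index("(")
    right = text.rindex(")")
    pid = int(text[:left].strip())
    comm = text[left + 1:right]
    state = text[right + 1:].split()[0]
    return pid, comm, state


def find_d_state(proc_root="/proc", name=None):
    """Return sorted (pid, comm) pairs for tasks in state D.

    proc_root and name are illustrative parameters: name optionally
    restricts the result to one command name such as "httpd".
    """
    stuck = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self"
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_stat(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited meanwhile, or record unreadable
        if state == "D" and (name is None or comm == name):
            stuck.append((pid, comm))
    return sorted(stuck)
```

Calling find_d_state(name="httpd") on each node returns the PIDs of httpd
tasks in uninterruptible sleep at that moment. A task can pass through D
state briefly during normal I/O, so only PIDs that keep showing up across
repeated runs are likely to be the hung ones.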