There is a known condition where locks are not released as they should be. A forthcoming patch adds a tunable parameter that allows purging a percentage of the unused but still retained locks. I tested it under the conditions that affect my system, and it was rock solid afterwards. At the time I tested it, the change had to be made after the system was up and running (i.e., it was not a config setting). Hopefully this will make it into Update 7.

Regards,
Corey

-----Original Message-----
From: linux-cluster-bounces@xxxxxxxxxx [mailto:linux-cluster-bounces@xxxxxxxxxx] On Behalf Of Stanley, Jon
Sent: Wednesday, March 08, 2006 1:54 PM
To: linux-cluster@xxxxxxxxxx
Subject: GFS load average and locking

I have a 7 node GFS cluster, plus 3 lock servers (RH AS3U5, GULM locking) that do not mount the filesystem. I have a problem whereby the load average on the system is extremely high (occasionally astronomical), eventually leading to a complete site outage because the shared filesystem can no longer be accessed.

The application is written in PHP, and the PHP sessioning is handled via the GFS filesystem as well, if that's important.

I have a couple of questions about the innards of GFS that I would be most grateful for someone to answer:

1) I notice that I have a lot of processes in uninterruptible sleep. When I attached strace to one of these processes, I found it doing nothing for a period of ~30-60 seconds.
An excerpt of the strace (using -r) follows:

    0.001224 stat64("/media/files/global/2/6/26c4f61c69117d55b352ce328babbff4.jpg", {st_mode=S_IFREG|0644, st_size=9072, ...}) = 0
    0.000251 open("/media/files/global/2/6/26c4f61c69117d55b352ce328babbff4.jpg", O_RDONLY) = 5
    0.000108 mmap2(NULL, 9072, PROT_READ, MAP_PRIVATE, 5, 0) = 0xaf381000
    0.000069 writev(4, [{"HTTP/1.1 200 OK\r\nDate: Wed, 08 M"..., 318}, {"\377\330\377\340\0\20JFIF\0\1\2\0\0d\0d\0\0\377\354\0\21"..., 9072}], 2) = 9390
    0.000630 close(5) = 0
    0.000049 munmap(0xaf381000, 9072) = 0
    0.000052 rt_sigaction(SIGUSR1, {0x81ef474, [], SA_RESTORER|SA_INTERRUPT, 0x1b2eb8}, {SIG_IGN}, 8) = 0
    0.000068 read(4, 0xa239b3c, 4096) = ? ERESTARTSYS (To be restarted)
    6.546891 --- SIGALRM (Alarm clock) @ 0 (0) ---
    0.000119 close(4) = 0

What it looks like is that it hangs out in read() for a period of time, thus leading to the uninterruptible sleep. This particular example was 6 seconds, but the time seems to be variable. The particular file in this instance is not large, only 9k. I've never seen ERESTARTSYS before, and some googling tells me that it's basically telling the kernel to interrupt the current syscall in order to handle a signal (SIGALRM in this case, whose function I'm not sure of). I could be *way* off base here - I'm not a programmer by any stretch of the imagination.

2) The locking statistics seem to be a huge mystery. The lock total doesn't seem to correspond to the number of open files that I have (I hope!). Here's the output of 'cat /proc/gulm/lockspace'. I can't imagine that I have 300,000+ files open on this system at this point. When are the locks released, or is this even an indication of how many locks are active at the current time? What does the 'pending' number mean?

    [svadmin@s259830hz1sl01 gulm]$ cat lockspace

    lock counts:
    total: 369822
    unl: 176518
    exl: 1555
    shd: 191501
    dfr: 0
    pending: 5
    lvbs: 2000
    lops: 21467433

    [svadmin@s259830hz1sl01 gulm]$

Thanks for any help that anyone can provide on this!

Thanks!
-Jon

--
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
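
For anyone who wants to watch these counters over time, here is a rough Python sketch that parses lockspace-style output and reports what share of the total each lock state accounts for. It is only an illustration: the function names are made up for this example, the sample text is the output quoted above, and reading "unl", "exl", "shd" and "dfr" as unlocked, exclusive, shared and deferred is my assumption rather than something confirmed in this thread.

```python
import re

# Sample text copied from the lockspace output quoted in this thread.
SAMPLE = """lock counts:
total: 369822
unl: 176518
exl: 1555
shd: 191501
dfr: 0
pending: 5
lvbs: 2000
lops: 21467433
"""

# Assumed meaning (not confirmed here): unlocked, exclusive, shared, deferred.
STATES = ("unl", "exl", "shd", "dfr")


def parse_lockspace(text):
    """Return a dict of every 'name: number' pair found in the text."""
    return {name: int(value) for name, value in re.findall(r"(\w+):\s*(\d+)", text)}


def state_shares(counts):
    """Return each state's count as a percentage of the 'total' counter."""
    total = counts.get("total", 0)
    if total == 0:
        return {state: 0.0 for state in STATES}
    return {state: 100.0 * counts.get(state, 0) / total for state in STATES}


if __name__ == "__main__":
    counts = parse_lockspace(SAMPLE)
    for state, pct in state_shares(counts).items():
        print(f"{state}: {counts[state]} ({pct:.1f}% of total)")
```

On a live node the text would come from reading /proc/gulm/lockspace instead of SAMPLE. Logging the shares at intervals would show whether the "unl" count keeps growing, which is the kind of retained-lock buildup described at the top of this message.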