Re: files with unknown state - locking problem?

German Staltari wrote:
Hi, we have a 6-node cluster with FC4, kernel 2.6.16 and the CVS STABLE branch of the cluster software. Sometimes, some processes (courier imap) hang in D state. When I run "ls -la" in the "tmp" directory of the mailbox that the process is trying to access (it is always the same directory, the same mailbox), the response is really slow and this is the output:

?--------- ? ? ? ? ? 1151074448.M345358P6861_courierlock.qmail-be-04
?--------- ? ? ? ? ? 1151074497.M326691P7647_courierlock.qmail-be-04
?--------- ? ? ? ? ? 1151074534.M524707P2198_courierlock.qmail-be-05
-rw-r--r-- 1 mailuser mailuser 17 Jun 23 12:07 1151074538.M785749P13408_courierlock.qmail-be-03
-rw-r--r-- 1 mailuser mailuser 16 Jun 23 12:09 1151074588.M917441P3132_courierlock.qmail-be-05
-rw-r--r-- 1 mailuser mailuser 16 Jun 23 12:09 1151074593.M62901P3189_courierlock.qmail-be-05
?--------- ? ? ? ? ? 1151074649.M845223P5214_courierlock.qmail-be-02
-rw-r--r-- 1 mailuser mailuser 17 Jun 23 12:09 1151074656.M448306P28724_courierlock.qmail-be-06
-rw-r--r-- 1 mailuser mailuser 16 Jun 23 12:07 1151074657.M188653P5302_courierlock.qmail-be-02
?--------- ? ? ? ? ? 1151074679.M821433P4979_courierlock.qmail-be-05
-rw-r--r-- 1 mailuser mailuser 16 Jun 23 12:07 1151074690.M360083P5741_courierlock.qmail-be-02
-rw-r--r-- 1 mailuser mailuser 17 Jun 23 12:07 1151074701.M709923P29422_courierlock.qmail-be-06
-rw-r--r-- 1 mailuser mailuser 16 Jun 23 12:07 1151074716.M544858P6016_courierlock.qmail-be-02
-rw-r--r-- 1 mailuser mailuser 16 Jun 23 12:07 1151074731.M21587P6179_courierlock.qmail-be-02
?--------- ? ? ? ? ? 1151074804.M241436P7410_courierlock.qmail-be-02
?--------- ? ? ? ? ? 1151074831.M678238P17302_courierlock.qmail-be-03
?--------- ? ? ? ? ? 1151074917.M42708P8494_courierlock.qmail-be-05
-rw-r--r-- 1 mailuser mailuser 17 Jun 23 12:08 1151074918.M541477P14716_courierlock.qmail-be-04
-rw-r--r-- 1 mailuser mailuser 17 Jun 23 12:08 1151074946.M520653P15248_courierlock.qmail-be-04
?--------- ? ? ? ? ? 1151075037.M234721P11020_courierlock.qmail-be-02
?--------- ? ? ? ? ? 1151075065.M951224P8598_courierlock.qmail-be-01
-rw-r--r-- 1 mailuser mailuser 17 Jun 23 12:09 1151075082.M788480P11712_courierlock.qmail-be-02
-rw-r--r-- 1 mailuser mailuser 17 Jun 23 12:09 1151075186.M911867P18565_courierlock.qmail-be-04
-rw-r--r-- 1 mailuser mailuser 17 Jun 23 12:08 1151075210.M366861P13891_courierlock.qmail-be-02
-rw-r--r-- 1 mailuser mailuser 17 Jun 23 12:09 1151075217.M850817P13366_courierlock.qmail-be-05
?--------- ? ? ? ? ? 1151075252.M599978P32483_imapuid_4.qmail-be-05

It seems like a lock problem, but I'm not sure. Is there any other tool that I can use to debug this?
Thanks
German

--

Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
Hi German,

I suspect you are right: the question marks in the ls -l output lead me to believe there is a problem with the locking of those files. My theory is this: ls -l calls the kernel's stat function on each entry to get its metadata. The stat tries to acquire an internal lock (glock) but can't, so it fails, and ls prints the question marks you see instead of valid values.
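To narrow this down, it may help to see which error stat is actually returning for the affected entries. The following is only a rough sketch, not a cluster tool: the function name find_unstattable is made up for illustration, and it assumes the stat call fails with an error rather than blocking forever (if it blocks, the script will hang on that entry just as ls does).

```python
import errno
import os
import sys


def find_unstattable(directory):
    """Return (name, errno_name) pairs for entries whose lstat() fails."""
    failures = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        try:
            # lstat() is the same kind of call ls -l makes for each entry.
            os.lstat(path)
        except OSError as exc:
            # Translate the numeric errno into its symbolic name, e.g. "EIO".
            failures.append((name, errno.errorcode.get(exc.errno, str(exc.errno))))
    return failures


if __name__ == "__main__":
    # Directory to check; defaults to the current directory.
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    for name, err in find_unstattable(target):
        print(f"{err}\t{name}")
```

Running it against the mailbox's tmp directory on each node would show whether the same files fail everywhere and with which errno, which is more to go on than the question marks alone.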

Perhaps courier imap is locking files and then hanging, so that the process stays around with the lock intact, or else it is being killed abnormally without the lock being released.
Do you have any suggestions on how we can recreate this problem in our lab?
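One way to check the "hanging with the lock intact" idea is to list the processes that are in D state on each node at the moment the problem shows up. Here is a minimal sketch that reads the state field from /proc/<pid>/stat; the function names are hypothetical, and it only reports which processes are stuck, not which lock they are waiting on.

```python
import os


def parse_stat_line(line):
    """Split a /proc/<pid>/stat line into (pid, command, state)."""
    # The command is wrapped in parentheses and may itself contain spaces
    # or parentheses, so split on the last ')' instead of on whitespace.
    head, _, tail = line.rpartition(")")
    pid, _, comm = head.partition("(")
    return int(pid), comm, tail.split()[0]


def uninterruptible_processes(proc_root="/proc"):
    """Return sorted (pid, command) pairs for processes in D state."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                pid, comm, state = parse_stat_line(handle.read())
        except OSError:
            continue  # the process exited while we were scanning
        if state == "D":
            found.append((pid, comm))
    return sorted(found)
```

Calling uninterruptible_processes() on every node and comparing the PIDs with the P<pid> part of the courierlock file names in the listing might show whether the stuck processes are the ones that created those files.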

Regards,

Bob Peterson
Red Hat Cluster Suite
