On Thu, 2006-07-06 at 20:53 +0100, Alan Thew wrote:
> On Thu, 6 Jul 2006 17:36 , Kjetil Torgrim Homme <kjetilho@xxxxxxxxxx> said:
> >we're running Cyrus 2.2.12 on RHEL3U7. about once a day (typically peak
> >hours), we have a problem with lmtpd just "hanging":
> >
> >Jul 6 13:36:27 mail-imap3 lmtpunix[1578]: executed
> >Jul 6 13:55:08 mail-imap3 lmtpunix[1578]: accepted connection
> >Jul 6 13:55:08 mail-imap3 lmtpunix[1578]: lmtp connection preauth'd as postman
> >
> >note the 18 minute delay. during this time, there are no deliveries to
> >this Cyrus.
> >[...]
> >
> >we are running several instances of Cyrus on each cluster node, and this
> >happens simultaneously to all instances on a given node.

just correcting myself slightly: this isn't always true, but it always
happens to more than one instance at a time.

> >looking at the code, it seems the wait happens inside
> >service/master.c:main, and probably is related to the lock file. is it
> >possible there is a race condition?
> >
> Are you using db or skiplist?

we're using Berkeley-db for mailboxes. now that you mention it, there
are some lines like these in the log, but these are the ones immediately
preceding and succeeding the above log entries:

Jul 6 12:41:48 mail-imap3 lmtpunix[31829]: DBERROR db4: 60 lockers
Jul 6 14:56:27 mail-imap3 lmtpunix[2959]: DBERROR db4: 12 lockers

these lines come from different Cyrus instances (the original excerpt is
from cyrus15, the first locker line is from cyrus03, the second locker
line is from cyrus01), so I'm not sure it's directly relevant? the
locking of mailboxes seems to happen after the connection has been made.

thank you for your input!
-- 
Kjetil T.

----
Cyrus Home Page: http://asg.web.cmu.edu/cyrus
Cyrus Wiki/FAQ: http://cyruswiki.andrew.cmu.edu
List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html
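
A minimal sketch, not part of the original message, of how one might scan a syslog file for stalls like the 18 minute gap quoted above. It assumes classic syslog timestamps without a year (the `year` parameter is a hypothetical default), and the names `LINE_RE` and `find_gaps` are illustrative only. A long gap only marks a candidate: an idle preforked process also logs nothing while it waits for a connection.

```python
import re
from datetime import datetime, timedelta

# Matches classic syslog lines such as:
#   Jul 6 13:36:27 mail-imap3 lmtpunix[1578]: executed
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+"
    r"(?P<proc>[\w-]+)\[(?P<pid>\d+)\]:\s+(?P<msg>.*)$"
)


def find_gaps(lines, threshold=timedelta(minutes=5), year=2006):
    """Return (host, proc, pid, start, end) for every silence longer than
    `threshold` between consecutive log lines of the same process."""
    last_seen = {}  # (host, proc, pid) -> timestamp of the previous line
    gaps = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip lines that are not in the expected format
        # Syslog omits the year, so it is supplied by the caller (assumption).
        # Collapse padding such as "Jul  6" before parsing; %b expects
        # English month names under the default C locale.
        stamp = " ".join(m["ts"].split())
        ts = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
        key = (m["host"], m["proc"], m["pid"])
        prev = last_seen.get(key)
        if prev is not None and ts - prev > threshold:
            gaps.append((*key, prev, ts))
        last_seen[key] = ts
    return gaps


if __name__ == "__main__":
    sample = [
        "Jul 6 13:36:27 mail-imap3 lmtpunix[1578]: executed",
        "Jul 6 13:55:08 mail-imap3 lmtpunix[1578]: accepted connection",
        "Jul 6 13:55:08 mail-imap3 lmtpunix[1578]: lmtp connection preauth'd as postman",
    ]
    for host, proc, pid, start, end in find_gaps(sample):
        print(f"{host} {proc}[{pid}]: silent from {start} to {end} ({end - start})")
```

Running the same scan over the logs of every instance on a node and comparing the reported intervals would show whether the stalls overlap in time, which is the pattern described in the message.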