Limit the number of lmtpd daemons to around 10 -- that solved the issue
for me. We let sendmail handle the queuing. It is more than likely a
locking issue.

Michael Bacon wrote:
> What database format are you using for the mailboxes database? What kind
> of storage is the "metapartition" (usually /var/imap) on? What kind of
> storage are your mail partitions on?
>
>
> --On Thursday, February 28, 2008 2:38 PM -0700 Jeff Fookson
> <jfookson@xxxxxxxxxxxxxx> wrote:
>
>> Folks-
>>
>> I am hoping to get some help and guidance as to why our installation
>> of cyrus-imapd 2.3.9 is unusably slow. Here are the specifics:
>>
>> The software is running on a 1.6GHz Opteron with 2Gb memory supporting
>> a user base of about 400 users. The average rate of arriving mail is
>> on the order of 1-2 messages/sec. The active mailstore is about 200GB.
>> There are typically about 200 'imapd' processes at a given time and a
>> hugely varying number of 'lmtpds' (from about 6 to many hundreds
>> during times of greatest pathology). System load is correspondingly in
>> the 2-15 range, but can spike to 50-70!
>>
>> Our users complain that the system is extremely sluggish during the
>> day when the system is most busy.
>>
>> The most obvious thing we observe is that both the lmtpds and the
>> imapds are spending HUGE times waiting on locks. Even when the system
>> load is only 1-2, an 'strace' attached to an instance of lmtpd or
>> imapd shows waits of upwards of 1-2 minutes to get a write lock as
>> shown by the example below (this is from a trace of an 'lmtpd')
>>
>> [strace -f -p 9817 -T]
>> 9817 fcntl(10, F_SETLKW, {type=F_WRLCK, whence=SEEK_SET, start=0, len=0}) = 0 <84.998159>
>>
>> We strongly suspect that these large times waiting on locks is what is
>> causing the slowness our users are reporting.
>>
>> We are under the impression that a single instance of cyrus-imapd
>> scales well up to about 1000 users (with about 1MB active memory per
>> 'imapd' process), and so we are baffled as to what might be going on.
>>
>> A non-standard aspect of our installation which may have something to
>> do with the problem is that we are running cyrus on an lvm2 partition
>> that itself is running on top of drbd. Thinking that the remote writes
>> to the drbd secondary might be causing delays, we put the primary in
>> stand-alone mode so that the drbd layer was not doing any network
>> activity (the drbd link is running at gigabit speed on its own
>> crossover cable to the secondary box) and saw no significant change in
>> behavior. Any issues due to locking and the lvm2 layer would, of
>> course, still be present even with drbd's activity reduced to just
>> local writes.
>>
>> Can anyone suggest what we might do next to debug the problem further?
>> Needless to say, our users get extremely unhappy when trivial
>> operations in their mail clients take over a minute to complete.
>>
>> Thank you for any thoughts or advice.
>>
>> Jeff Fookson
>>
>> --
>> Jeffrey E. Fookson, PhD              Phone: (520) 621 3091
>> Support Systems Analyst, Principal   jfookson@xxxxxxxxxxxxxx
>> Steward Observatory
>> University of Arizona
>>
>> ----
>> Cyrus Home Page: http://cyrusimap.web.cmu.edu/
>> Cyrus Wiki/FAQ: http://cyrusimap.web.cmu.edu/twiki
>> List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html
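
To see how widespread the lock waits in the quoted strace output are, something like the rough sketch below might help. It scans saved `strace -T` output for blocking `fcntl(..., F_SETLKW, ...)` calls and lists the slowest ones. The function name `slow_lock_waits`, the 1-second threshold, and the assumption that each line starts with a bare PID (as in the sample above) are mine, not anything from Cyrus or strace itself, so adjust the pattern if your strace prints lines differently.

```python
import re
import sys

# Matches lines shaped like the sample in the quoted message:
#   9817 fcntl(10, F_SETLKW, {type=F_WRLCK, ...}) = 0 <84.998159>
# Assumption: each line starts with a bare PID and ends with the -T timing.
LOCK_RE = re.compile(
    r"^(?P<pid>\d+)\s+fcntl\((?P<fd>\d+),\s*F_SETLKW,\s*"
    r"\{(?:l_)?type=(?P<type>F_\w+).*<(?P<secs>\d+\.\d+)>\s*$"
)


def slow_lock_waits(lines, threshold=1.0):
    """Return (pid, fd, lock_type, seconds) tuples for blocking fcntl
    lock calls that took at least `threshold` seconds, slowest first."""
    hits = []
    for line in lines:
        m = LOCK_RE.match(line.strip())
        if m is None:
            continue  # not a blocking fcntl lock call, or no timing present
        secs = float(m.group("secs"))
        if secs >= threshold:
            hits.append(
                (int(m.group("pid")), int(m.group("fd")), m.group("type"), secs)
            )
    return sorted(hits, key=lambda h: h[3], reverse=True)


if __name__ == "__main__":
    # Reads saved strace output from standard input.
    for pid, fd, kind, secs in slow_lock_waits(sys.stdin):
        print(f"pid {pid} fd {fd} {kind}: waited {secs:.1f}s")
```

Saving the trace to a file with strace's `-o` option and feeding that file to the script on standard input should work. If nearly every long wait is on the same file descriptor, that would point at contention on one shared file rather than general I/O slowness.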