Jeff,

Just as a rule of thumb, if you've got problems with Cyrus (or any mail
system), 90% of the time they're related to I/O performance. I've never
seen drbd used for Cyrus, but it looks like other folks have done it. The
combination of drbd+lvm2+ext3 might put you somewhere unpleasant, but I'll
have to let the Linux-heads jump in on that one.

Beyond that, I don't see anything obviously wrong, but maybe someone who's
run it more on Linux can chime in.

-Michael

--On Thursday, February 28, 2008 3:36 PM -0700 Jeff Fookson
<jfookson@xxxxxxxxxxxxxx> wrote:

> Michael Bacon wrote:
>
>> What database format are you using for the mailboxes database? What
>> kind of storage is the "metapartition" (usually /var/imap) on? What
>> kind of storage are your mail partitions on?
>
> Databases are all skiplist. Our mail partition and the metapartition are
> both on the same filesystem, as we intended that both be part of the same
> drbd mirror. That partition is a linux software RAID 5 (3 SATA disks). On
> top of the md layer is the drbd device; on top of that is an lvm2 logical
> volume; on top of that is an ext3 filesystem, mounted as '/var/imap'. The
> mail is then in /var/imap/mail and the metadata in /var/imap/config (and
> we also have /var/imap/certs for the ssl stuff, and /var/imap/sieve for
> sieve scripts).
>
> Thanks.
>
> Jeff Fookson
>
>>
>> --On Thursday, February 28, 2008 2:38 PM -0700 Jeff Fookson
>> <jfookson@xxxxxxxxxxxxxx> wrote:
>>
>>> Folks-
>>>
>>> I am hoping to get some help and guidance as to why our installation of
>>> cyrus-imapd 2.3.9 is unusably slow. Here are the specifics:
>>>
>>> The software is running on a 1.6GHz Opteron with 2Gb memory supporting a
>>> user base of about 400 users. The average rate of arriving mail is on
>>> the order of 1-2 messages/sec. The active mailstore is about 200GB.
>>> There are typically about 200 'imapd' processes at a given time and a
>>> hugely varying number of 'lmtpds' (from about 6 to many hundreds during
>>> times of greatest pathology). System load is correspondingly in the 2-15
>>> range, but can spike to 50-70!
>>>
>>> Our users complain that the system is extremely sluggish during the day
>>> when the system is most busy.
>>>
>>> The most obvious thing we observe is that both the lmtpds and the imapds
>>> are spending HUGE times waiting on locks. Even when the system load is
>>> only 1-2, an 'strace' attached to an instance of lmtpd or imapd shows
>>> waits of upwards of 1-2 minutes to get a write lock as shown by the
>>> example below (this is from a trace of an 'lmtpd')
>>>
>>> [strace -f -p 9817 -T]
>>> 9817 fcntl(10, F_SETLKW, {type=F_WRLCK, whence=SEEK_SET, start=0, len=0}) = 0 <84.998159>
>>>
>>> We strongly suspect that these large times waiting on locks is what is
>>> causing the slowness our users are reporting.
>>>
>>> We are under the impression that a single instance of cyrus-imapd scales
>>> well up to about 1000 users (with about 1MB active memory per 'imapd'
>>> process), and so we are baffled as to what might be going on.
>>>
>>> A non-standard aspect of our installation which may have something to do
>>> with the problem is that we are running cyrus on an lvm2 partition that
>>> itself is running on top of drbd. Thinking that the remote writes to the
>>> drbd secondary might be causing delays, we put the primary in
>>> stand-alone mode so that the drbd layer was not doing any network
>>> activity (the drbd link is running at gigabit speed on its own crossover
>>> cable to the secondary box) and saw no significant change in behavior.
>>> Any issues due to locking and the lvm2 layer would, of course, still be
>>> present even with drbd's activity reduced to just local writes.
>>>
>>> Can anyone suggest what we might do next to debug the problem further?
>>> Needless to say, our users get extremely unhappy when trivial operations
>>> in their mail clients take over a minute to complete.
>>>
>>> Thank you for any thoughts or advice.
>>>
>>> Jeff Fookson
>>>
>>> --
>>> Jeffrey E. Fookson, PhD                 Phone: (520) 621 3091
>>> Support Systems Analyst, Principal      jfookson@xxxxxxxxxxxxxx
>>> Steward Observatory
>>> University of Arizona
>>>
>>> ----
>>> Cyrus Home Page: http://cyrusimap.web.cmu.edu/
>>> Cyrus Wiki/FAQ: http://cyrusimap.web.cmu.edu/twiki
>>> List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html
>>
>
> --
> Jeffrey E. Fookson, PhD                 Phone: (520) 621 3091
> Support Systems Analyst, Principal      jfookson@xxxxxxxxxxxxxx
> Steward Observatory
> University of Arizona
>
----
Cyrus Home Page: http://cyrusimap.web.cmu.edu/
Cyrus Wiki/FAQ: http://cyrusimap.web.cmu.edu/twiki
List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html
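
One way to put numbers on the lock waits described in the thread is to save the strace -T output and pull out every blocking fcntl(F_SETLKW) call that took longer than some threshold. The Python below is only a rough sketch of that idea. It assumes the trace lines look like the single fcntl line quoted above; other strace versions may print the lock fields differently (for example with an "l_" prefix, which the pattern tolerates), so the regular expression may need adjusting. The names LOCK_LINE and slow_lock_waits and the 1-second default threshold are made up for illustration and are not part of Cyrus or strace.

```python
import re

# Pattern for one line of `strace -T` output describing a blocking
# fcntl lock request. Assumption: the layout matches the line quoted
# in the thread, e.g.
#   9817 fcntl(10, F_SETLKW, {type=F_WRLCK, ...}) = 0 <84.998159>
LOCK_LINE = re.compile(
    r"^(?P<pid>\d+)\s+fcntl\((?P<fd>\d+), F_SETLKW, "
    r"\{(?:l_)?type=(?P<kind>F_\w+),.*\}\)\s+=\s+(?P<ret>-?\d+)"
    r".*<(?P<secs>\d+\.\d+)>\s*$"
)


def slow_lock_waits(lines, threshold=1.0):
    """Return (pid, fd, lock type, seconds) for each F_SETLKW call
    whose elapsed time is at least `threshold` seconds.

    `lines` is any iterable of strings (an open file works).
    Lines that do not match LOCK_LINE are skipped silently.
    """
    hits = []
    for line in lines:
        m = LOCK_LINE.match(line.strip())
        if m is None:
            continue
        secs = float(m.group("secs"))
        if secs >= threshold:
            hits.append(
                (int(m.group("pid")), int(m.group("fd")), m.group("kind"), secs)
            )
    return hits


# The trace line from the original post, used as sample input.
sample = [
    "9817 fcntl(10, F_SETLKW, {type=F_WRLCK, whence=SEEK_SET, "
    "start=0, len=0}) = 0 <84.998159>",
]
for pid, fd, kind, secs in slow_lock_waits(sample):
    print(f"pid {pid} fd {fd} {kind}: waited {secs:.3f}s")
```

Run against a saved trace (passing the open file to slow_lock_waits instead of the sample list), this gives a list of which processes and file descriptors are blocked longest, which may help show whether the waits cluster on one file such as the mailboxes database or are spread across many mailboxes.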