On Mon, Sep 29, 2008 at 12:37:12PM +0100, Mark Cave-Ayland wrote:
> Hi there,
>
> I'm experiencing a problem with Cyrus 2.3.8 interacting with an Outlook
> client and was hoping this would be the right place to get some advice.
>
> What happens is that periodically (maybe around once a month?) we have
> one particular user who contacts us complaining that they are unable
> to access their mailbox. Generally we always find the same thing: there is
> an imapd process accessing his seen DB which is running at 100% CPU.
> Once this process is killed then things go back to normal and the user
> can log in.
>
> The latest report we had of this problem happening again was this
> morning, and fortunately I was in a position to attack it with gdb and a
> file of debug symbols. This showed that the process in question was
> getting stuck in a loop in index_expungeuidlist(). I've uploaded the
> transcript of my gdb session to
> http://pastebin.siriusit.co.uk/cyrus-imapd-gdb.txt for people who are
> familiar with cyrus internals.
>
> The short story appears to be that newseenuids (new) points to an empty
> string ('\0') and so the code gets stuck because of the following at
> line 532 of imap/index.c in index_checkseen():
>
>     oldseen = (*old == ':');
>     oldnext = 0;
>
> Since *old is an empty string, oldseen will always be 0, and so the
> while() loop never exits. Unfortunately this is the first time I've ever
> looked at cyrus internals, so am not really sure what the seen list
> should look like normally.

No, BUT.

    while (oldnext <= uid) {
        ...
        if (!*old) oldnext = mailbox->last_uid+1;
    }

If your mailbox is corrupted such that last_uid is less than an actual uid
in the mailbox, then you will get an infinite loop here.

> The confusing thing is that we have been using these packages for
> several clients and this is the *only* particular server and the *only*
> user on this server experiencing this problem. The one thing we have
> noticed is that this particular user has a larger mailbox compared to
> the other users (~1GB) but then it doesn't seem so large as if it would
> cause any problems.

Yeah, it's a corrupted mailbox.

> Finally, one more thing to add is that we have already gone through the
> steps of rebuilding the seen DB skiplist using the skiplist.py script
> several times when this has happened in the past, and it has made no
> difference.

No, it won't. You need to fix the mailbox or patch the code to not be
put into an infinite loop by a bogus index file.

The attached patch might do the trick for you. I just slapped it
together on spec. It compiles, that's about all I can offer about it :)

Bron ( and no, I can't spell. Tough ;) )
Index: cyrus-imapd-2.3.12p2/imap/index.c
===================================================================
--- cyrus-imapd-2.3.12p2.orig/imap/index.c	2008-09-29 22:31:30.000000000 +1000
+++ cyrus-imapd-2.3.12p2/imap/index.c	2008-09-29 22:36:38.000000000 +1000
@@ -584,7 +584,12 @@
 		else {
 		    oldseen = (*old == ':');
 		    oldnext = 0;
-		    if (!*old) oldnext = mailbox->last_uid+1;
+		    if (!*old) {
+			oldnext = mailbox->last_uid+1;
+			/* just in case the index is corrupted, don't
+			 * loop forever */
+			if (oldnext < uid) oldnext = uid;
+		    }
 		    else old++;
 		    while (cyrus_isdigit((int) *old)) {
 			oldnext = oldnext * 10 + *old++ - '0';
@@ -602,6 +607,9 @@
 		    newnext = 0;
 		    if (!*new) {
 			newnext = mailbox->last_uid+1;
+			/* just in case the index is corrupted, don't
+			 * loop forever */
+			if (newnext < uid) newnext = uid;
 			neweof++;
 		    }
 		    else new++;
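For anyone who wants to see the failure mode in isolation, here is a minimal stand-alone sketch of the loop shape quoted in the reply. It is not the Cyrus code: advance_past_uid(), its guard flag and the max_iters cap are hypothetical names made up for this illustration, and the seen list is hard-wired to the empty string reported in the gdb session. Note one deliberate difference from the patch: the sketch clamps to uid + 1 rather than uid, because its simplified loop tests oldnext <= uid directly. Whether clamping to uid is enough in the real function depends on code that is not shown in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the quoted loop, NOT the real index.c code.
 * Returns the number of iterations taken, or -1 if max_iters is reached
 * while the loop condition is still true (the real loop would spin). */
int advance_past_uid(unsigned long uid, unsigned long last_uid,
                     bool guard, int max_iters)
{
    unsigned long oldnext = 0;  /* next uid at which seen state changes */
    const char *old = "";       /* empty seen list, as in the report */
    int iters = 0;

    assert(max_iters > 0);

    while (oldnext <= uid) {
        if (iters++ >= max_iters) return -1;  /* cap instead of hanging */
        if (!*old) {
            oldnext = last_uid + 1;
            /* assumed guard for this sketch: never leave oldnext at or
             * below uid when last_uid is bogus */
            if (guard && oldnext <= uid) oldnext = uid + 1;
        }
    }
    return iters;
}
```

With a sane index (uid 100, last_uid 150) the loop exits after one pass. With a corrupted one (uid 200, last_uid 150) the unguarded version hits the cap and returns -1, while the guarded version again exits after one pass.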
----
Cyrus Home Page: http://cyrusimap.web.cmu.edu/
Cyrus Wiki/FAQ: http://cyrusimap.web.cmu.edu/twiki
List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html