On 02 Jul 2010, at 09:29, John Madden wrote:

> I'm concerned about the listener_lock timeouts.

The listener_lock timeout means that the thread waited around for 60 seconds to see if a connection was going to arrive. Since it didn't, it timed out and that thread went away. The pthread routine that you sit in for those 60 seconds is pthread_cond_timedwait() (see the first sketch appended below). Perhaps your pthread implementation or kernel is implementing a busy wait?

>>> Jul 1 15:16:54 imap mupdate[18203]: unready for connections
>>> Jul 1 15:16:54 imap mupdate[18203]: synchronizing mailbox list with
>>> master mupdate server
>>
>> are the interesting messages. They say to me that the connection to
>> the mupdate master is being lost. However, there ought to be an
>> error message to that effect, which I don't see. What's happening on
>> the mupdate master?
>
> On both the frontend and the master, mupdate consumes 100% of the CPU
> for a few minutes. I agree, it seems like the update is failing and
> then restarting. How do I prevent that? It went on like this for a
> few hours twice yesterday, then cleared itself up and it hasn't
> happened since.
>
> We have been in the process of adding about 100,000 more users over
> the last few days (so 500k mailboxes). Is it possible for a frontend
> to get out of sync with the master to the point where catch-up
> periods like this become necessary? I thought each mailbox creation
> was synchronous across the murder, so I'm thinking not, but the
> timing is interesting.

Mailbox creation is synchronous between a backend and the mupdate master. Frontends are streamed updates from the mupdate master, typically every few seconds, so they can definitely get behind. On the frontends, imapd and lmtpd "kick" the slave mupdate if a mailbox they are looking for is missing; the kick is meant to ensure that the slave mupdate is up to date.

I don't think the problem is adding the mailboxes, per se. The only time a slave tries to resync is when the connection to the master is lost, or the slave THINKS the connection to the master is lost. If the mupdate master is very busy doing something else and can't respond to NOOPs issued by the mupdate slaves, then the slaves will consider the connection to be lost, drop it, and attempt to resync (see the second sketch appended below). Since resyncing is a resource-intensive activity (and single-threaded on the mupdate master, to boot), this can begin a thrashing cycle of dropped connections between the mupdate slaves and the master. Bad news, and best avoided...

> Can I do anything with the prefork parameter for mupdate to spread
> things out on more CPUs or increase concurrency?

Prefork doesn't do anything useful for mupdate -- it's about forking and accepting connections, not about threads.

The mupdate master is multithreaded in many situations. The mupdate slave on the frontends is almost never multithreaded, but it does share code with the mupdate master, so you see messages about threads.

I suspect that mupdate on the master and the slave are each consuming 100% of one CPU because the slave is attempting to resync. That's a synchronous, single-threaded activity on both ends, so I would expect it to take a lot of CPU and to stay on a single CPU.

:wes

----
Cyrus Home Page: http://cyrusimap.web.cmu.edu/
Cyrus Wiki/FAQ: http://cyrusimap.web.cmu.edu/twiki
List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html
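
First sketch: a minimal, self-contained illustration of how a worker thread can sit in pthread_cond_timedwait() for up to 60 seconds and then give up, which is the shape of the listener_lock timeout described above. This is not Cyrus source; the names listener_mutex, listener_cond, and pending_connection, and the log text, are made up for the example.

    /* build: cc -pthread cond_sketch.c -- illustration only, not Cyrus source */
    #include <errno.h>
    #include <pthread.h>
    #include <stdio.h>
    #include <time.h>

    static pthread_mutex_t listener_mutex = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t  listener_cond  = PTHREAD_COND_INITIALIZER;
    static int pending_connection = 0;    /* would be set by an acceptor thread */

    static void *worker(void *arg)
    {
        struct timespec deadline;

        (void)arg;
        clock_gettime(CLOCK_REALTIME, &deadline);
        deadline.tv_sec += 60;            /* wait at most 60 seconds */

        pthread_mutex_lock(&listener_mutex);
        while (!pending_connection) {
            /* Sleeps in the kernel (no busy wait) until signalled or the
             * absolute deadline passes. */
            int rc = pthread_cond_timedwait(&listener_cond, &listener_mutex,
                                            &deadline);
            if (rc == ETIMEDOUT) {
                /* No connection arrived within 60s: the thread notes the
                 * timeout and goes away, much like the listener_lock message. */
                fprintf(stderr, "listener_lock timeout, thread exiting\n");
                break;
            }
        }
        pthread_mutex_unlock(&listener_mutex);
        return NULL;
    }

    int main(void)
    {
        pthread_t tid;

        pthread_create(&tid, NULL, worker, NULL);
        pthread_join(tid, NULL);          /* nothing signals, so this returns after ~60s */
        return 0;
    }

If the wait really is implemented as a busy loop by the pthread library or kernel, this is exactly the spot where the 100% CPU would show up.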
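
Second sketch: an illustration of the "slave thinks the connection is lost" mechanism. The slave pings the master with a NOOP and waits only a bounded time for a reply; if the master is too busy to answer, that looks the same as a dead connection, so the slave drops it and falls into the expensive resync. Again this is a sketch under stated assumptions, not the actual mupdate code: the 30-second timeout, the "N01" tag, and the function names are invented for the example.

    /* build: cc -c noop_sketch.c -- illustration only, not mupdate source */
    #include <poll.h>
    #include <unistd.h>

    #define NOOP_REPLY_TIMEOUT_MS (30 * 1000)   /* assumed value, not a Cyrus default */

    /* Send a NOOP and wait a bounded time for any reply.
     * Returns 0 if the master answered in time, -1 otherwise. */
    static int check_master_alive(int fd)
    {
        struct pollfd pfd = { .fd = fd, .events = POLLIN };
        char buf[256];

        if (write(fd, "N01 NOOP\r\n", 10) != 10)
            return -1;                      /* write failed: connection is gone */

        /* A master too busy to answer before the timeout looks exactly
         * like a dead connection from here: poll() returns 0. */
        if (poll(&pfd, 1, NOOP_REPLY_TIMEOUT_MS) <= 0)
            return -1;

        return read(fd, buf, sizeof(buf)) > 0 ? 0 : -1;
    }

    /* Keepalive loop: once the NOOP check fails, the slave drops the
     * connection and falls into the expensive, single-threaded full
     * resync, which is what can start the thrashing cycle described above. */
    void keepalive_loop(int fd)
    {
        while (check_master_alive(fd) == 0)
            sleep(NOOP_REPLY_TIMEOUT_MS / 1000);

        close(fd);
        /* ...reconnect to the master and resynchronize the mailbox list... */
    }

The design point is that the timeout cannot distinguish "master is dead" from "master is busy resyncing another slave", which is why one slow resync can cascade into many.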