> I assume that the frontend is also the murder master?

No, the mupdate master is on another server.

> Other pertinent questions are, "How many connection do you normally get?"

Just now this number of daemons are running:

  Imapd/s: 327/458
  Pop3d/s: 10/5

They may get up to 700 ...
The limit on imapd / imaps in cyrus.conf is 1000 each.
We get 6 - 10 IMAP/POP logins per second.

Ok, counting: 5 connections to the failing backend per second,
10 seconds timeout ... there should be enough free daemons available.

> and "In what way is the backend 'down'?"

Machine down ... crash, not reachable (not just imapd down).
Only one IPv4 address involved.

> The client_timeout sets an alarm that interrupts the connect system
> call. The frontend may try more than once, tho, if the backend has
> more than one address, e.g., IPv4 and IPv6. Are you observing imapd
> and pop3d on the frontend that are waiting more than client_timeout
> to give up? As they fail to connect, clients should log:
>
>   connect(server-name) failed: timed out

Yes, it's here:

  connect(server-name) failed: Connection timed out

Ok, this is exactly 10 seconds (== client_timeout) after the login message.

> Another possibility is that the clients are poorly behaved, e.g.,
> they are getting an error on SELECT, but don't close the connection
> to the frontend. The client_timeout is just controlling the timeout
> of the connect from the frontend to the backend, not the duration of
> the life of the frontend processes. For imapd, the timeout is 30
> *minutes*.

Oh, well.

I will get a "chance" to test this situation again next week
(kernel upgrade on backends). I'll try with enlarging the maxchild
in cyrus.conf and/or decreasing the client_timeout in imapd.conf.

Thanks for your ideas and help!
- Frank

> > I'm running a simple standard murder environment (v2.3.8 on Linux
> > x86_64) - one frontend, two backends.
> > If one of the two backends is down,
> > then the frontend (and the whole system) becomes unavailable after
> > some minutes:
> >
> > On the frontend imapd's and pop3d's are started till the maximum
> > count (maxchild in cyrus.conf) is reached. It seems that they're still
> > trying to reach the "dead" backend server.
> >
> > Maybe it is a timeout issue - I let the default client_timeout
> > (10 seconds) in imapd.conf. Is this value relevant for this behavior?

-- 
Email: Frank.Richter@xxxxxxxxxxxxxxxxxx    http://www.tu-chemnitz.de/~fri/
Work:  Computing Services, Chemnitz University of Technology, Germany
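
A rough back-of-the-envelope check of the counting argument in this thread, as a sketch only. It assumes a constant 5 connections per second to the dead backend (the figure from the message), that each such connection occupies one frontend process for a fixed hold time, and it uses the 10-second client_timeout and the 30-minute imapd timeout mentioned above. The function names are hypothetical and not part of Cyrus.

```python
# Sketch: estimate how many frontend processes are tied up by a dead backend.
# Assumption: arrivals are constant and every affected connection holds one
# process for a fixed time (occupancy = arrival rate * hold time).

def steady_state_processes(arrivals_per_sec, hold_seconds):
    """Processes occupied at steady state for a given hold time."""
    return arrivals_per_sec * hold_seconds

def seconds_until_exhausted(arrivals_per_sec, maxchild, already_running=0):
    """Time to hit maxchild if no affected process exits in the meantime."""
    return (maxchild - already_running) / arrivals_per_sec

RATE = 5                  # connections/s to the failing backend (from the thread)
CLIENT_TIMEOUT = 10       # seconds, imapd.conf client_timeout
IMAPD_TIMEOUT = 30 * 60   # seconds, imapd timeout cited in the quoted reply
MAXCHILD = 1000           # cyrus.conf limit per service

# Case 1: processes exit as soon as the connect to the backend times out.
released = steady_state_processes(RATE, CLIENT_TIMEOUT)

# Case 2: clients keep the frontend connection open after the error.
held = steady_state_processes(RATE, IMAPD_TIMEOUT)

fill_time = seconds_until_exhausted(RATE, MAXCHILD)

print(f"released after client_timeout: {released} processes")
print(f"held until imapd timeout:      {held} processes (limit {MAXCHILD})")
print(f"time to reach maxchild:        {fill_time:.0f} s")
```

Under these assumptions the first case stays far below maxchild, while the second exceeds it within a few minutes, which would be consistent with the frontend becoming unavailable "after some minutes". Whether clients really hold their connections open is not confirmed in the thread.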