On Fri, May 28, 2010 at 03:49:41PM -0400, Wesley Craig wrote:
> On 28 May 2010, at 12:42, Gary Mills wrote:
> > 0805e4ee proxy_check_input (815d168, 81a7228, 819e520, 81a3d60,
> >81a7700, 0) + 5e
>
> That last argument to proxy_check_input()?  It's the timeout.
> Setting it to 0 means "don't time out".  I'm sure the theory is that
> the underlying select() will return when the backend's poptimeout
> happens, and the connection is closed.  It would be good to know why
> that's not happening as expected.  Of course, the fact that bitpipe()
> isn't checking the return value of prot_flush() is also bug.

Yes, the timeout is set to zero in the pop3d.c file.  However, the
idle timeout actually works when I test it.  In one window, I do
this:

    $ telnet setup01 pop3
    Trying 130.179.16.64...
    Connected to setup01.cc.umanitoba.ca.
    Escape character is '^]'.
    +OK testing.umanitoba.ca Cyrus POP3 Murder v2.3.8 server ready
    user gmills
    +OK Name is a valid mailbox
    pass XXXXXX
    +OK Mailbox locked and ready
    /* wait for the timeout */
    -ERR [SYS/PERM] Fatal error: Lost connection to input stream
    Connection to setup01.cc.umanitoba.ca closed by foreign host.

Sure enough, on the server the new pop3d process exits after 20
minutes.
While it's waiting, the stack trace looks like this:

    # pstack 13804
    13804:  pop3d
     feb1a465 pollsys  (8042da0, 2, 8042e60, 0)
     feac3b8a pselect  (d, 8042eb4, feb90318, feb90318, 8042e60, 0) + 18e
     feac3e80 select   (d, 8042eb4, 0, 0, 8042ea8, 0) + 82
     0808981b prot_select (8189548, ffffffff, 8043f94, 0, 8042ea8, 0) + 44b
     0805e4ee proxy_check_input (8189548, 8145a30, 8145aa8, 814d718, 814d308, 0) + 5e
     0805dd74 bitpipe  (8145c38, 0, feb921ec, 0, 8044fed, 8044fed) + c4
     0805acb7 cmdloop  (8135594, 8138980, 14, 2, 31203133, 312e3033) + 27
     0805aa53 service_main (1, 8142a50, 8047db8) + 473
     08062c13 main     (1, 8047db0, 8047db8, feffb818) + a83
     08059bbd _start   (1, 8047e58, 0, 8047e5e, 8047e69, 8047e7c) + 7d

It stays in the pollsys system call the entire time but finally
returns with a zero return code.  The process then writes that error
message to FD 1, has a little dialogue with the back end, and then
terminates.

The ones I saw before were not stuck in pollsys(), however.  They were
stuck in a read() from FD 0.  The timeout didn't work on those, but
the TCP keepalive does get them.  They had a very short stack trace,
like this:

    # pstack 12708
    12708:  pop3d -s
     feb1a5c5 read     (0, 817faf0, b)
     fec2dfaf sock_read () + 3f

I don't know why the stack trace is so short with these.

--
-Gary Mills-    -Unix Group-    -Computer and Network Services-
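P.S.  To illustrate the difference between the two traces above (a
wait in select() that returns zero when the timeout expires, versus a
bare read() that has no timeout at all), here is a minimal sketch.  It
is Python rather than the Cyrus C code, and the helper names
wait_readable() and read_with_idle_timeout() are made up for this
example; they are not Cyrus functions.  Note that in Python's
select.select() a timeout of None means "wait forever" and 0 means
"poll once", which is not the same convention as the zero timeout
discussed in the quoted message.

```python
import select
import socket


def wait_readable(sock, timeout):
    """Return True if sock becomes readable within timeout seconds.

    timeout=None blocks indefinitely; a number bounds the wait.
    """
    readable, _, _ = select.select([sock], [], [], timeout)
    return bool(readable)


def read_with_idle_timeout(sock, timeout, nbytes=512):
    """Read from sock, giving up if it stays idle for timeout seconds.

    Returns the bytes read, b"" on EOF, or None on idle timeout.
    """
    if not wait_readable(sock, timeout):
        return None           # select() timed out, nothing was read
    return sock.recv(nbytes)  # data (or EOF) is already pending


# Demonstration on a local socket pair (no network needed).
client, server = socket.socketpair()

idle = read_with_idle_timeout(server, 0.2)  # client sends nothing
client.sendall(b"QUIT\r\n")
line = read_with_idle_timeout(server, 0.2)  # data is pending now
client.close()
eof = read_with_idle_timeout(server, 0.2)   # peer closed: EOF, not timeout
server.close()
```

A process that calls recv() directly, without the select() guard,
would sit in the read until the peer sends data or the connection is
torn down, which matches the behaviour of the short stack trace above
as far as I can tell.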