I'm running a Cyrus Murder with 2 frontends, 2 backends, and 1 mupdate
master, all at v2.2.13. Occasionally, proxyd processes on the frontends
seem to hang while handling IDLE commands from clients. Each hung process
shows a connection like the following in netstat's output:
tcp 54 0 128.193.4.143:54538 128.193.4.142:143 CLOSE_WAIT
(54 bytes sitting in the Recv-Q from the backend).
gdb shows the following stack:
(gdb) where
#0 0xb7c3d9f8 in select () from /lib/tls/libc.so.6
#1 0x08076c7c in prot_select (readstreams=0x8126668, extra_read_fd=-1, out=0xbfa11f84, extra_read_flag=0x0, timeout=0xbfa11f88) at prot.c:1093
#2 0x080537d8 in cmd_idle (tag=0x8145398 "nhqt") at proxyd.c:2678
#3 0x08050dbe in cmdloop () at proxyd.c:1701
#4 0x0804f78c in service_main (argc=2, argv=0x811d008, envp=0xbfa153f0) at proxyd.c:1306
#5 0x0804c380 in main (argc=2, argv=0x3a805a, envp=0xbfa153f0) at service.c:533
An strace of the same process shows the following sequence repeating:
open("/var/spool/cyrus/config/msg/shutdown", O_RDONLY) = -1 ENOENT (No such file or directory)
time(NULL) = 1161625450
select(1, [0], NULL, NULL, {60, 0}) = 0 (Timeout)
time(NULL) = 1161625510
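Note that the select() call in the strace lists only descriptor 0 (nfds is 1). I am assuming that descriptor 0 is the client connection, in which case the backend socket holding the 54 unread bytes is not being watched at all. To illustrate what I think is going on, here is a minimal Python sketch. It is not Cyrus code: the socket pairs and the names demo, read_until_eof, client_side and backend_side are made up for the example.

```python
import select
import socket


def read_until_eof(sock):
    """Collect everything pending on sock until the peer's close shows up as b""."""
    chunks = []
    while True:
        chunk = sock.recv(4096)
        if not chunk:  # an empty read means the peer closed the connection
            return b"".join(chunks)
        chunks.append(chunk)


def demo(timeout=0.2):
    # Hypothetical stand-ins for the two connections a proxy holds during IDLE.
    client_side, client_peer = socket.socketpair()
    backend_side, backend_peer = socket.socketpair()
    try:
        # The "backend" sends 54 bytes and closes, as in the netstat line.
        backend_peer.sendall(b"x" * 54)
        backend_peer.close()

        # Watching only the client: nothing to read, so select() times out.
        only_client, _, _ = select.select([client_side], [], [], timeout)

        # Watching both: the backend socket is reported readable.
        both, _, _ = select.select([client_side, backend_side], [], [], timeout)

        pending = read_until_eof(backend_side)
        return {
            "client_only_ready": len(only_client),
            "backend_ready": backend_side in both,
            "client_ready": client_side in both,
            "pending_bytes": len(pending),
        }
    finally:
        for s in (client_side, client_peer, backend_side):
            s.close()
```

In this sketch, the select() that watches only the client side times out even though the other connection has data and a pending close, which is the same pattern as the 60-second timeouts above. Whether proxyd's cmd_idle really omits the backend stream from its prot_select call is something I have not confirmed in the source.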
There seems to be a bug in the handling of IMAP IDLE commands in proxyd.
I found bug #2762 in the Cyrus Bugzilla, whose fix addressed a race
condition with IDLE in imapd, but that fix did not include any changes
to proxyd. I don't know whether that bug is relevant here in any case.
Has anyone seen this before?
Andy