On Fri, October 15, 2010 11:41 am, Rodrigo Ventura wrote:
> Hello all,
>
> for some time now we have been bugged with this problem: in our setup we have
> cyrus-imapd (on HOST1) listening to both imaps:993 and imap:143 ports, the
> former for the users, and the latter for a "saslauthd -a rimap" running on
> another host (on HOST2; for SMTP AUTH purposes). While the imaps:993 has been
> working flawlessly, from time to time imap:143 stops working. The socket is
> still there, LISTENing, but when telnet'ing it it does not respond:
>
> HOST2$ telnet HOST1 imap
> Trying HOST1...
> Connected to HOST1.
> Escape character is '^]'.
>
> The connection is established:
>
> HOST2$ netstat -ntpe | grep 84545334
> (Not all processes could be identified, non-owned process info
>  will not be shown, you would have to be root to see it all.)
> tcp  0  0 HOST2:49774  HOST1:143  ESTABLISHED 1000  84545334  2310/telnet
>
> but on the other side:
>
> HOST1# netstat -ntpae|grep :49774
> tcp  0  0 HOST1:143  HOST2:49774  SYN_RECV  0  0  -
>
> Now, on the HOST1 side, "master" preforks several imapd -s and imapd
> processes, but while there is a imapd -s listening to 993:
>
> # netstat -ntpa|grep 0.0.0.0:993
> tcp  0  0 0.0.0.0:993  0.0.0.0:*  LISTEN  725/imapd
>
> it is master who is listening to 143:
>
> # netstat -ntpa|grep 0.0.0.0:143
> tcp  0  0 0.0.0.0:143  0.0.0.0:*  LISTEN  28090/master
>
> But what is really strange is that master does not seem to include the
> LISTENing port in its select() call:
>
> # netstat -ntpae | grep master
> tcp  0  0 0.0.0.0:110   0.0.0.0:*  LISTEN  0  24593785    28090/master
> tcp  0  0 0.0.0.0:143   0.0.0.0:*  LISTEN  0  *24593773*  28090/master
> tcp  0  0 0.0.0.0:2003  0.0.0.0:*  LISTEN  0  24593803    28090/master
> tcp  0  0 :::993        :::*       LISTEN  0  24593777    28090/master
> tcp  0  0 :::995        :::*       LISTEN  0  24593789    28090/master
> tcp  0  0 :::110        :::*       LISTEN  0  24593783    28090/master
> tcp  0  0 :::143        :::*       LISTEN  0  24593771    28090/master
> tcp  0  0 :::2000       :::*       LISTEN  0  24593795    28090/master
> tcp  0  0 :::2003       :::*       LISTEN  0  24593801    28090/master
>
> # ls -laF /proc/28090/fd|grep socket
> lrwx------ 1 root root 64 Oct 15 09:46 *10* -> socket:[24593773] <<<
> lrwx------ 1 root root 64 Oct 15 09:46 13 -> socket:[24593777]
> lrwx------ 1 root root 64 Oct 15 09:46 16 -> socket:[24593779]
> lrwx------ 1 root root 64 Oct 15 09:46 19 -> socket:[24593783]
> lrwx------ 1 root root 64 Oct 15 09:46 22 -> socket:[24593785]
> lrwx------ 1 root root 64 Oct 15 09:46 25 -> socket:[24593789]
> lrwx------ 1 root root 64 Oct 15 09:46 28 -> socket:[24593791]
> lrwx------ 1 root root 64 Oct 15 09:46 31 -> socket:[24593795]
> lrwx------ 1 root root 64 Oct 15 09:46 34 -> socket:[24593797]
> lrwx------ 1 root root 64 Oct 15 09:46 37 -> socket:[24593801]
> lrwx------ 1 root root 64 Oct 15 09:46 40 -> socket:[24593803]
> lrwx------ 1 root root 64 Oct 15 09:46 43 -> socket:[24593805]
> lrwx------ 1 root root 64 Oct 15 09:46 46 -> socket:[24593808]
> lrwx------ 1 root root 64 Oct 15 09:46 5 -> socket:[24593753]
> lrwx------ 1 root root 64 Oct 15 09:46 7 -> socket:[24593771]
>
> # strace -p 28090
> Process 28090 attached - interrupt to quit
> select(48, [8 11 14 17 20 23 26 29 32 35 38 41 44 47], NULL, NULL, {3, 544000}) = 1 (in [47], left {2, 300000})
> read(47, "\1\0\0\0\2330\0\0", 8) = 8
> [...]
>
> The dirty way of solving this is to kill and restart master again, but in the
> meantime our users are no longer able to use SMTP AUTH from HOST2...
>
> Any clues of what is going on here?
>
> Cheers,
>
> Rodrigo Ventura
> ISR / IST
>
> PS: after restarting master, it is still not including 143 on its select(),
> but someone is responding to 143:
>
> HOST2$ telnet HOST1 143
> Trying HOST1...
> Connected to HOST1.
> Escape character is '^]'.
> * OK [CAPABILITY IMAP4 IMAP4rev1 LITERAL+ ID STARTTLS AUTH=PLAIN AUTH=OTP
> AUTH=DIGEST-MD5 AUTH=CRAM-MD5 SASL-IR COMPRESS=DEFLATE] HOST1 Cyrus IMAP
> v2.3.16 server ready
>
> HOST2$ netstat -ntpa|grep 143
> (Not all processes could be identified, non-owned process info
>  will not be shown, you would have to be root to see it all.)
> tcp  0  0 HOST2:33584  HOST1:143  ESTABLISHED  3398/telnet
>
> HOST1# netstat -ntpae|grep :33584
> tcp  0  0 HOST1:143  HOST2:33584  ESTABLISHED  96  27619330  18847/imapd
>

Rodrigo,

What does your /etc/cyrus.conf look like?

In particular, pay attention to the way you make the Cyrus master
distinguish between different service names.

On 2.3.15, we ran into a very similar situation: I had distinguished two
services by giving one of them a suffix made of a hyphen and some text, but
Cyrus apparently used only the part before the hyphen in its internal
housekeeping. Half of the time one service was responding, half of the time
the other one. Very annoying.

To make things clearer:

problem situation:

    popserv      cmd="pop3d -C ..."
    popserv-sec  cmd="pop3d -s -C ..."

working situation:

    popserv      cmd="pop3d -C ..."
    popservsec   cmd="pop3d -s -C ..."

What helped us find the solution was noticing that syslog entries also
carried only the first part of the service name (popserv).

Hope this helps,

Eric Luyten, Computing Centre VUB/ULB.
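For anyone who wants to check their own cyrus.conf for the same trap, here is a minimal sketch in Python. It assumes, based only on the 2.3.15 behaviour described above, that master effectively keys a service on the part of its name before the first hyphen. The function name find_prefix_collisions, the simplified line-by-line parsing of the SERVICES block, and the listen= values in the sample are illustrative assumptions, not anything taken from Cyrus itself.

```python
import re
from collections import defaultdict

# Matches an entry line such as:  popserv-sec  cmd="pop3d -s" listen="pop3s"
ENTRY_RE = re.compile(r'^\s*([A-Za-z0-9_-]+)\s+cmd=')
# Matches a section opener such as:  SERVICES {
SECTION_RE = re.compile(r'^\s*([A-Z]+)\s*\{')


def find_prefix_collisions(conf_text):
    """Return {prefix: [names]} for SERVICES entries whose names share
    the same text before the first hyphen (hypothetical helper)."""
    groups = defaultdict(list)
    section = None
    for line in conf_text.splitlines():
        line = line.split('#', 1)[0]  # naive comment stripping
        opener = SECTION_RE.match(line)
        if opener:
            section = opener.group(1)
            continue
        if line.strip().startswith('}'):
            section = None
            continue
        if section != 'SERVICES':
            continue
        entry = ENTRY_RE.match(line)
        if entry:
            name = entry.group(1)
            groups[name.split('-', 1)[0]].append(name)
    # Keep only prefixes claimed by more than one service name.
    return {p: names for p, names in groups.items() if len(names) > 1}


# Sample input modelled on the problem situation; listen= values are assumed.
sample = '''
SERVICES {
  popserv      cmd="pop3d" listen="pop3"
  popserv-sec  cmd="pop3d -s" listen="pop3s"
  imap         cmd="imapd" listen="imap"
}
'''
print(find_prefix_collisions(sample))
# prints {'popserv': ['popserv', 'popserv-sec']}
```

An empty result only means no two names share a prefix before a hyphen; it says nothing about other causes of a listener dropping out of master's select() set.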