cyrus chokes after few logins

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Alohá!

First off: Sorry für crossposting this but I think it hast to do with
both, cyrus and SASL.

Second off: Beware, long post! (Although a good bit is just logs) :-)

Abstract:
Cyrus or sasl choke after a few POP3 logins following quickly after one another despite high prefork values.

The Problem:
A relatively aggressive POP3 retrieval plugin for exchange (GFI Mailessentials) polls 65 accounts at the same time. The logins for the first few accounts succeed but the remaining tries time out without any trace, leaving the TCP connection in a CLOSE_WAIT state.

The Test:
Using getmail from another sarge machine to pop 65 blank accounts in one go I can reproduce the success of approx. 7 logins within the first second, the remaining take close to the timeout limit or do time out.
Please see the timestamps of the following log:

epsilon:~# tail -f /var/log/mail.log |grep testuser
Oct 13 23:48:16 localhost cyrus/pop3d[18228]: login: [IP_of_querying_server] testuser1 plaintext Oct 13 23:48:17 localhost cyrus/pop3d[21595]: login: [IP_of_querying_server] testuser2 plaintext Oct 13 23:48:17 localhost cyrus/pop3d[25583]: login: [IP_of_querying_server] testuser3 plaintext Oct 13 23:48:17 localhost cyrus/pop3d[14381]: login: [IP_of_querying_server] testuser4 plaintext Oct 13 23:48:17 localhost cyrus/pop3d[1144]: login: [IP_of_querying_server] testuser5 plaintext Oct 13 23:48:17 localhost cyrus/pop3d[11874]: login: [IP_of_querying_server] testuser6 plaintext Oct 13 23:48:17 localhost cyrus/pop3d[21571]: login: [IP_of_querying_server] testuser7 plaintext Oct 13 23:48:17 localhost cyrus/pop3d[27250]: login: [IP_of_querying_server] testuser8 plaintext Oct 13 23:48:17 localhost cyrus/pop3d[5194]: login: [IP_of_querying_server] testuser9 plaintext Oct 13 23:48:17 localhost cyrus/pop3d[19623]: login: [IP_of_querying_server] testuser10 plaintext Oct 13 23:49:15 localhost cyrus/pop3d[12990]: login: [IP_of_querying_server] testuser11 plaintext Oct 13 23:49:37 localhost cyrus/pop3d[19480]: login: [IP_of_querying_server] testuser12 plaintext Oct 13 23:49:37 localhost cyrus/pop3d[5285]: login: [IP_of_querying_server] testuser13 plaintext Oct 13 23:51:32 localhost cyrus/pop3d[22249]: login: [IP_of_querying_server] testuser14 plaintext Oct 13 23:51:32 localhost cyrus/pop3d[35]: login: [IP_of_querying_server] testuser15 plaintext Oct 13 23:52:37 localhost cyrus/pop3d[30733]: login: [IP_of_querying_server] testuser16 plaintext Oct 13 23:52:54 localhost cyrus/pop3d[13558]: login: [IP_of_querying_server] testuser17 plaintext Oct 13 23:52:54 localhost cyrus/pop3d[4925]: login: [IP_of_querying_server] testuser18 plaintext Oct 13 23:53:17 localhost cyrus/pop3d[20686]: login: [IP_of_querying_server] testuser19 plaintext Oct 13 23:53:33 localhost cyrus/pop3d[21112]: login: [IP_of_querying_server] testuser20 plaintext Oct 13 23:53:59 localhost cyrus/pop3d[24909]: login: [IP_of_querying_server] testuser21 plaintext Oct 13 23:54:00 localhost cyrus/pop3d[16212]: login: [IP_of_querying_server] testuser22 plaintext Oct 13 23:54:22 localhost cyrus/pop3d[23280]: login: [IP_of_querying_server] testuser23 plaintext Oct 13 23:55:42 localhost cyrus/pop3d[22569]: login: [IP_of_querying_server] testuser24 plaintext Oct 13 23:55:42 localhost cyrus/pop3d[1855]: login: [IP_of_querying_server] testuser25 plaintext Oct 13 23:56:12 localhost cyrus/pop3d[11290]: login: [IP_of_querying_server] testuser26 plaintext Oct 13 23:56:52 localhost cyrus/pop3d[1508]: login: [IP_of_querying_server] testuser27 plaintext Oct 13 23:56:52 localhost cyrus/pop3d[11878]: login: [IP_of_querying_server] testuser28 plaintext Oct 13 23:57:16 localhost cyrus/pop3d[8838]: login: [IP_of_querying_server] testuser29 plaintext Oct 13 23:58:18 localhost cyrus/pop3d[11535]: login: [IP_of_querying_server] testuser30 plaintext Oct 13 23:58:18 localhost cyrus/pop3d[20806]: login: [IP_of_querying_server] testuser31 plaintext Oct 13 23:58:48 localhost cyrus/pop3d[15110]: login: [IP_of_querying_server] testuser32 plaintext Oct 13 23:59:20 localhost cyrus/pop3d[27262]: login: [IP_of_querying_server] testuser33 plaintext Oct 13 23:59:53 localhost cyrus/pop3d[28168]: login: [IP_of_querying_server] testuser34 plaintext Oct 13 23:59:53 localhost cyrus/pop3d[5339]: login: [IP_of_querying_server] testuser35 plaintext Oct 14 00:01:18 localhost cyrus/pop3d[4315]: login: [IP_of_querying_server] testuser36 plaintext Oct 14 00:01:57 localhost cyrus/pop3d[8924]: login: [IP_of_querying_server] testuser37 plaintext Oct 14 00:01:57 localhost cyrus/pop3d[29483]: login: [IP_of_querying_server] testuser38 plaintext Oct 14 00:03:29 localhost cyrus/pop3d[9070]: login: [IP_of_querying_server] testuser39 plaintext Oct 14 00:03:30 localhost cyrus/pop3d[29472]: login: [IP_of_querying_server] testuser40 plaintext Oct 14 00:04:16 localhost cyrus/pop3d[6504]: login: [IP_of_querying_server] testuser41 plaintext Oct 14 00:04:16 localhost cyrus/pop3d[27329]: login: [IP_of_querying_server] testuser42 plaintext Oct 14 00:04:38 localhost cyrus/pop3d[32760]: login: [IP_of_querying_server] testuser43 plaintext Oct 14 00:05:00 localhost cyrus/pop3d[11606]: login: [IP_of_querying_server] testuser44 plaintext Oct 14 00:05:35 localhost cyrus/pop3d[26602]: login: [IP_of_querying_server] testuser45 plaintext Oct 14 00:05:35 localhost cyrus/pop3d[5887]: login: [IP_of_querying_server] testuser46 plaintext Oct 14 00:06:17 localhost cyrus/pop3d[2847]: login: [IP_of_querying_server] testuser47 plaintext Oct 14 00:06:36 localhost cyrus/pop3d[8207]: login: [IP_of_querying_server] testuser48 plaintext Oct 14 00:07:41 localhost cyrus/pop3d[4035]: login: [IP_of_querying_server] testuser49 plaintext Oct 14 00:07:42 localhost cyrus/pop3d[19858]: login: [IP_of_querying_server] testuser50 plaintext Oct 14 00:08:39 localhost cyrus/pop3d[3667]: login: [IP_of_querying_server] testuser51 plaintext Oct 14 00:08:39 localhost cyrus/pop3d[25244]: login: [IP_of_querying_server] testuser52 plaintext Oct 14 00:09:06 localhost cyrus/pop3d[29777]: login: [IP_of_querying_server] testuser53 plaintext Oct 14 00:09:37 localhost cyrus/pop3d[842]: login: [IP_of_querying_server] testuser54 plaintext Oct 14 00:10:27 localhost cyrus/pop3d[5103]: login: [IP_of_querying_server] testuser55 plaintext Oct 14 00:11:03 localhost cyrus/pop3d[28995]: login: [IP_of_querying_server] testuser56 plaintext Oct 14 00:11:34 localhost cyrus/pop3d[12469]: login: [IP_of_querying_server] testuser57 plaintext Oct 14 00:11:34 localhost cyrus/pop3d[25816]: login: [IP_of_querying_server] testuser58 plaintext Oct 14 00:12:01 localhost cyrus/pop3d[29592]: login: [IP_of_querying_server] testuser59 plaintext Oct 14 00:12:31 localhost cyrus/pop3d[28422]: login: [IP_of_querying_server] testuser60 plaintext Oct 14 00:13:10 localhost cyrus/pop3d[18671]: login: [IP_of_querying_server] testuser61 plaintext Oct 14 00:13:10 localhost cyrus/pop3d[23530]: login: [IP_of_querying_server] testuser62 plaintext Oct 14 00:13:36 localhost cyrus/pop3d[18888]: login: [IP_of_querying_server] testuser63 plaintext Oct 14 00:14:08 localhost cyrus/pop3d[5330]: login: [IP_of_querying_server] testuser64 plaintext Oct 14 00:14:08 localhost cyrus/pop3d[15592]: login: [IP_of_querying_server] testuser65 plaintext



The Thesis:
Ummmm, is sasldb maybe not suited for a high amount of simultaneous requests and a race condition occurs that is resolved only by timeouts? Then again we do have 50-150 logins per minute under normal circumstances - and that's no problem whatsoever.

The setup:

debian sarge cyrus 2.1.18-1+sarge2

epsilon:~/tmp# grep -v \# /etc/cyrus.conf | grep .
START {
        recover         cmd="/usr/sbin/ctl_cyrusdb -r"

        delprune        cmd="/usr/sbin/ctl_deliver -E 3"
        tlsprune        cmd="/usr/sbin/tls_prune"
}
SERVICES {
imap cmd="imapd -U 30" listen="imap" prefork=75 maxchild=250 maxforkrate=0 imaps cmd="imapd -s -U 30" listen="imaps" prefork=5 maxchild=250 maxforkrate=0 pop3 cmd="pop3d -U 30" listen="pop3" prefork=75 maxchild=250 maxforkrate=0 pop3s cmd="pop3d -s -U 30" listen="pop3s" prefork=1 maxchild=250 maxforkrate=0 lmtpunix cmd="lmtpd" listen="/var/run/cyrus/socket/lmtp" prefork=0 maxchild=200 sieve cmd="timsieved" listen="localhost:sieve" prefork=0 maxchild=500 notify cmd="notifyd" listen="/var/run/cyrus/socket/notify" proto="udp" prefork=1
}
EVENTS {
        checkpoint      cmd="/usr/sbin/ctl_cyrusdb -c" period=30
        delprune        cmd="/usr/sbin/ctl_deliver -E 3" at=0401
        tlsprune        cmd="/usr/sbin/tls_prune" at=0401
        squatter      cmd="squatter -r *" period=30
}

interesting bits from imapd.conf:

popminpoll: 1
duplicatesuppression: yes
hashimapspool: true
allowplaintext: yes
sasl_mech_list: PLAIN LOGIN
sasl_minimum_layer: 256
sasl_maximum_layer: 256
sasl_pwcheck_method: auxprop
sasl_auxprop_plugin: sasldb
sasl_auto_transition: no

Also noteworthy:
Postfix authenticates users sending mail via rimap (saslauthd) thus also accessing sasldb. Not at this time of day though, we haven't had a single mail sent in the past 15 min and hardly any pickups either.

Looking at the imapd.conf just now does make me wonder why sasl even allows pop3 plaintext logins without TLS with sasl_minimum_layer set to 256... Then again I never really understood how the sasl-layer figures are calculated, does anyone have a hint for a FM I could R? The manpage for imapd.conf doesn't really tell.

As You can see from the timestamps this is ruining my weekend already, I'm at the brink of kicking the server really really hard and any input would be greatly appreciated. If I can provide any more info...

TIA

Martin

[Index of Archives]     [Info Cyrus]     [Squirrel Mail]     [Linux Media]     [Yosemite News]     [gtk]     [KDE]     [Gimp on Windows]     [Steve's Art]

  Powered by Linux