Hi everybody, I've noticed a weird behavior in a two-host setup. Consider the following situation. I set up two hosts to act as Master and Replica. I sent a message to Master that pushed it to Replica. I sent two more messages to Master, and they were synced just fine too. Effectively, I can continue to send messages in a similar fashion as long as I want to. But what I did next was swapped the roles of the two hosts so that Master became Replica and vice versa. I continued to send messages to a new Master then, and it pushed the changes to a new Replica. Everything seems to work fine. Or does it? I decided to make a break and left the two boxes to idle. When I returned I sent one more message to the new Master and this is what I saw in logs: ======================== Master Jul 22 13:24:19 imap2-replica sendmail[3179]: p6MAOJvn003177: to=<zxy@localhost.localdomain>, delay=00:00:00, xdelay=00:00:00, mail er=cyrusv2, pri=120338, relay=localhost, dsn=2.0.0, stat=Sent Jul 22 13:24:20 imap2-replica sync_client[3157]: MAILBOX received NO response: IMAP_MAILBOX_CRC Checksum Failure Jul 22 13:24:20 imap2-replica sync_client[3157]: CRC failure on sync for user.zxy, trying full update Jul 22 13:24:20 imap2-replica sync_client[3157]: SYNCNOTICE: record mismatch with replica: user.zxy more recent on replica Jul 22 13:24:20 imap2-replica sync_client[3157]: SYNCNOTICE: master uid:7 modseq:10 last_updated:1311327073 internaldate:1311327073 flags:(NonJunk) Jul 22 13:24:20 imap2-replica sync_client[3157]: SYNCNOTICE: replica uid:7 modseq:11 last_updated:1311327590 internaldate:131132707 3 flags:(NonJunk) Jul 22 13:24:20 imap2-replica sync_client[3157]: SYNCNOTICE: record mismatch with replica: user.zxy more recent on replica Jul 22 13:24:20 imap2-replica sync_client[3157]: SYNCNOTICE: master uid:7 modseq:10 last_updated:1311327073 internaldate:1311327073 flags:(NonJunk) Jul 22 13:24:20 imap2-replica sync_client[3157]: SYNCNOTICE: replica uid:7 modseq:11 last_updated:1311327590 internaldate:131132707 3 flags:(NonJunk) Jul 22 13:24:20 imap2-replica sync_client[3157]: user.zxy: same message appears twice 8 9 Jul 22 13:24:20 imap2-replica sync_client[3157]: Unlinking files in mailbox user.zxy Jul 22 13:24:20 imap2-replica sync_client[3157]: do_folders(): update failed: user.zxy 'Bad protocol' Jul 22 13:24:20 imap2-replica sync_client[3157]: Error in do_sync(): bailing out! Bad protocol Jul 22 13:24:20 imap2-replica sync_client[3157]: Processing sync log file /var/lib/imap/sync/log-3157 failed: Bad protocol ======================== Replica Jul 22 15:40:06 imap2-master syncserver[3782]: executed Jul 22 15:40:06 imap2-master syncserver[3782]: accepted connection Jul 22 15:40:06 imap2-master syncserver[3782]: cmdloop(): startup Jul 22 15:40:06 imap2-master syncserver[3782]: login: replica.localdomain [10.10.0.129] cyrus PLAIN User logged in Jul 22 15:40:06 imap2-master syncserver[3782]: IOERROR: failed to parse /var/spool/imap/sync./3782/5906c967e91e2d9ffec225fdc502ba8d 0d8739cb Jul 22 15:40:06 imap2-master syncserver[3782]: IOERROR: failed to append file user.zxy 4 I was able to reproduce the described scenario 4 times in a row. Each time these errors crop up ONLY after a long enough pause (say 30 minutes) in between the acts of sending messages to the Master host. I wrestled with this issue throughout the entire Friday last week only to see no changes at all. So, I had to put this on hold for the week-end time and resolved to get back to the issue next week, which I did today. So, I left the boxes to idle last Friday evening, and when I came this morning to the office and checked the logs, they didn't have the above mentioned errors at all. Almost no sync_* related messages at all, in fact. OK, so I tried to send another new message to the host configured currently to act as a Master, and surprisingly it was delivered to the mailbox as well as pushed to the Replica host. I sent some more messages and they were delivered too, as well as successfully synced up. So, I left the boxes to idle again for some 30+ minutes to attempt to reproduce the error messages, and sent one more message after 30+ minutes. This time I've seen a similar error message: ======================== Master ler=cyrusv2, pri=120406, relay=localhost, dsn=2.0.0, stat=Sent Jul 25 12:25:24 imap2-replica sync_client[17397]: MAILBOX received NO response: IMAP_MAILBOX_CRC Checksum Failure Jul 25 12:25:24 imap2-replica sync_client[17397]: CRC failure on sync for user.zxy, trying full update Jul 25 12:25:24 imap2-replica sync_client[17397]: SYNCNOTICE: record mismatch with replica: user.zxy more recent on replica Jul 25 12:25:24 imap2-replica sync_client[17397]: SYNCNOTICE: master uid:8 modseq:11 last_updated:1311583915 internaldate:131158391 5 flags:(NonJunk) Jul 25 12:25:24 imap2-replica sync_client[17397]: SYNCNOTICE: replica uid:8 modseq:12 last_updated:1311584194 internaldate:13115839 15 flags:(NonJunk) Jul 25 12:25:24 imap2-replica sync_client[17397]: SYNCNOTICE: record mismatch with replica: user.zxy more recent on replica Jul 25 12:25:24 imap2-replica sync_client[17397]: SYNCNOTICE: master uid:8 modseq:11 last_updated:1311583915 internaldate:131158391 5 flags:(NonJunk) Jul 25 12:25:24 imap2-replica sync_client[17397]: SYNCNOTICE: replica uid:8 modseq:12 last_updated:1311584194 internaldate:13115839 15 flags:(NonJunk) Jul 25 12:25:24 imap2-replica sync_client[17397]: user.zxy: same message appears twice 9 10 Jul 25 12:25:24 imap2-replica sync_client[17397]: Unlinking files in mailbox user.zxy Jul 25 12:25:24 imap2-replica sync_client[17397]: do_folders(): update failed: user.zxy 'Bad protocol' Jul 25 12:25:24 imap2-replica sync_client[17397]: Error in do_sync(): bailing out! Bad protocol Jul 25 12:25:24 imap2-replica sync_client[17397]: Processing sync log file /var/lib/imap/sync/log-17397 failed: Bad protocol ======================== Replica Jul 25 12:25:20 imap2-master syncserver[16955]: accepted connection Jul 25 12:25:20 imap2-master syncserver[16955]: cmdloop(): startup Jul 25 12:25:20 imap2-master syncserver[16955]: login: replica.localdomain [10.10.0.129] cyrus PLAIN User logged in Jul 25 12:25:24 imap2-master syncserver[16955]: IOERROR: failed to parse /var/spool/imap/sync./16955/99c7b97f08d7dc94a5335bf6c32719 2e6185e4e6 Jul 25 12:25:24 imap2-master syncserver[16955]: IOERROR: failed to append file user.zxy 7 Now, obviously nothing has been changed in the configuration of the two boxes. They just stood there idling for two days and then they've started to work again. Only for a short time. Can anyone please explain what is going here? Why does it seem cyrus-imap needs some time to somehow fix things on its own? This is my configuration: ======================== Master, cyrus.conf START { recover cmd="ctl_cyrusdb -r" idled cmd="idled" syncclient cmd="/usr/lib/cyrus-imapd/sync_client -r" } SERVICES { imap cmd="imapd" listen="imap" prefork=5 imaps cmd="imapd -s" listen="imaps" prefork=1 lmtpunix cmd="lmtpd" listen="/var/lib/imap/socket/lmtp" prefork=1 } EVENTS { checkpoint cmd="ctl_cyrusdb -c" period=30 delprune cmd="cyr_expire -E 3" at=0400 tlsprune cmd="tls_prune" at=0400 } ======================== Master, imapd.conf configdirectory: /var/lib/imap partition-default: /var/spool/imap admins: cyrus sievedir: /var/lib/imap/sieve sendmail: /usr/sbin/sendmail hashimapspool: true sasl_pwcheck_method: saslauthd sasl_mech_list: PLAIN tls_cert_file: /etc/pki/cyrus-imapd/cyrus-imapd-my.pem tls_key_file: /etc/pki/cyrus-imapd/cyrus-imapd-my.pem tls_ca_file: /etc/pki/tls/certs/cacert.pem # uncomment this if you're operating in a DSCP environment (RFC-4594) # qosmarking: af13 sync_host: master.localdomain sync_authname: cyrus sync_password: xxxxxxxxxxxxxxx sync_log: 1 allowplaintext: 1 tcp_keepalive: 1 guid_mode: sha1 delete_mode: immediate ======================== Replica, cyrus.conf START { recover cmd="ctl_cyrusdb -r" idled cmd="idled" } SERVICES { imap cmd="imapd" listen="imap" prefork=5 imaps cmd="imapd -s" listen="imaps" prefork=1 lmtpunix cmd="lmtpd" listen="/var/lib/imap/socket/lmtp" prefork=1 syncserver cmd="/usr/lib/cyrus-imapd/sync_server" listen="csync" } EVENTS { checkpoint cmd="ctl_cyrusdb -c" period=30 delprune cmd="cyr_expire -E 3" at=0400 tlsprune cmd="tls_prune" at=0400 } ======================== Replica, imapd.conf configdirectory: /var/lib/imap partition-default: /var/spool/imap admins: cyrus sievedir: /var/lib/imap/sieve sendmail: /usr/sbin/sendmail hashimapspool: true sasl_pwcheck_method: saslauthd sasl_mech_list: PLAIN tls_cert_file: /etc/pki/cyrus-imapd/cyrus-imapd-my.pem tls_key_file: /etc/pki/cyrus-imapd/cyrus-imapd-my.pem tls_ca_file: /etc/pki/tls/certs/cacert.pem # uncomment this if you're operating in a DSCP environment (RFC-4594) # qosmarking: af13 allowplaintext: 1 tcp_keepalive: 1 guid_mode: sha1 delete_mode: immediate ======================== Software CentOS 5 Final on x86_64 [root@clone-machine ~]# yum list installed |grep cyrus cyrus-imapd.x86_64 2.4.10-1 installed cyrus-imapd-debuginfo.x86_64 2.4.10-1 installed cyrus-imapd-devel.x86_64 2.4.10-1 installed cyrus-imapd-perl.x86_64 2.4.10-1 installed cyrus-imapd-utils.x86_64 2.4.10-1 installed cyrus-sasl.x86_64 2.1.22-4 installed cyrus-sasl-devel.x86_64 2.1.22-4 installed cyrus-sasl-lib.x86_64 2.1.22-4 installed cyrus-sasl-plain.x86_64 2.1.22-4 installed Kolab build ---- Cyrus Home Page: http://www.cyrusimap.org/ List Archives/Info: http://lists.andrew.cmu.edu/pipermail/info-cyrus/