Hi,

Yesterday morning I upgraded from 2.3.7(-11) to 2.3.8(-3) using the RPM packages made by Simon. Everything seemed fine, but at the end of the day the replication stalled. It stalled in a way I had not seen before with 2.3.7 (and earlier), so I thought it might be worth reporting. (Just to be sure; maybe someone recognizes it, maybe it rings a bell.)

I noticed it because our sync_client bailed out. We monitor this, and actually restart the sync_client process right away using the same script. I always check afterwards whether everything is indeed back to normal. Normally the synchronization continues as usual after that restart; this time I also needed to restart the sync_server on the replica (actually, I just restarted the cyrus-master on the replica).

Before the restart, it seemed that the syncserver was never in an "unlocked" state. Just an example from the logs of the replica that afternoon:

Feb 14 17:17:07 rogge master[7033]: about to exec /usr/lib/cyrus-imapd/sync_server
Feb 14 17:17:07 rogge syncserver[7033]: executed
Feb 14 17:17:07 rogge syncserver[7033]: accepted connection
Feb 14 17:17:07 rogge syncserver[7033]: cmdloop(): startup
Feb 14 17:17:07 rogge syncserver[7033]: login: tarwe-ng.surfnet.nl [192.87.109.23] cyrus DIGEST-MD5 User logged in

... and nothing further (until another sync_client tried to connect, at which point the same sequence was repeated). I was not even able to do a manual synchronization: there was no (debug/verbose) output from the (client) process, and nothing happened.
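
The stalled pattern described above (a login line that is never followed by "Unlocked" from the same syncserver PID) lends itself to a simple log check. What follows is only a minimal sketch in Python, assuming the syslog format shown in the excerpts; the function name find_stalled_pids is made up for illustration and is not part of Cyrus.

```python
import re

# Matches the syncserver part of syslog lines such as:
#   Feb 14 17:17:07 rogge syncserver[7033]: accepted connection
SYNC_LINE = re.compile(r"syncserver\[(?P<pid>\d+)\]: (?P<msg>.*)$")


def find_stalled_pids(lines):
    """Return PIDs that logged a login but no later 'Unlocked' line."""
    waiting = {}  # pid -> login line, in order of first appearance
    for line in lines:
        match = SYNC_LINE.search(line)
        if not match:
            continue  # not a syncserver line (e.g. master[...])
        pid = int(match.group("pid"))
        msg = match.group("msg").strip()
        if msg.startswith("login:"):
            waiting[pid] = line.rstrip("\n")
        elif msg == "Unlocked":
            # The session made progress, so it is not stalled.
            waiting.pop(pid, None)
    return list(waiting)
```

It could be fed the lines of whatever file syslog writes the replica's syncserver messages to. A session that has only just logged in would also be reported, so a real check would probably want a grace period before raising an alarm.
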

Normally the unlock does happen (this one is from after the restart of the processes on rogge, our replica):

Feb 14 18:19:18 rogge syncserver[7344]: accepted connection
Feb 14 18:19:18 rogge syncserver[7385]: executed
Feb 14 18:19:18 rogge syncserver[7344]: cmdloop(): startup
Feb 14 18:19:18 rogge syncserver[7344]: login: tarwe-ng.surfnet.nl [192.87.109.23] cyrus DIGEST-MD5 User logged in
Feb 14 18:19:20 rogge syncserver[7344]: Unlocked

And after that I do indeed get updates, like:

Feb 14 18:19:23 rogge syncserver[7344]: seen_db: user paul opened /data/config/imap/user/p/paul.seen
Feb 14 18:19:23 rogge syncserver[7344]: Unlocked
Feb 14 18:19:25 rogge syncserver[7344]: Unlocked
Feb 14 18:19:27 rogge syncserver[7344]: seen_db: user luuk opened /data/config/imap/user/l/luuk.seen
Feb 14 18:19:27 rogge syncserver[7344]: Unlocked

The system has been replicating happily for another day now, but the fact that it stalled in this way was new to me. Maybe I overlooked it in the other cases (mwah), and maybe I was too impatient this time ;-) but I guess waiting half an hour to see whether the replication continues (while seeing no traffic worth mentioning between the hosts) should be enough.

I'm not very worried about this right now. I'll just carefully check the replica every time, and maybe monitor this a bit better.

Regards,
Paul