> I run it directly, outside of master. That way when it crashes, it
> can be easily restarted. I have a script that checks that it's
> running, that the log file isn't too big, and that there are no
> log-PID files that are too old. If anything like that happens, it
> pages someone.

Ditto, we do almost exactly the same thing. In addition, when we switch master/replica roles, our code looks for any incomplete log files after stopping the master and runs those first, to ensure that replication is completely up to date.

Unfortunately, it seems that anyone seriously using replication has to do these things manually at the moment. Replication just isn't reliable enough: we see sync_client bail out quite regularly, and there isn't enough logging to pinpoint exactly why each time. I think there are certain race conditions that still need ironing out, because rerunning sync_client on the same log file that caused a bail-out usually succeeds the second time.

It would be nice if some code were actually made part of the core Cyrus distribution to make this all work properly, including switching master/replica roles.

Rob

----
Cyrus Home Page: http://cyrusimap.web.cmu.edu/
Cyrus Wiki/FAQ: http://cyrusimap.web.cmu.edu/twiki
List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html
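
For anyone wanting a starting point, here is a minimal Python sketch of the kind of check script described in the quoted text: is sync_client running, is the log file too big, and are there log-PID files that are too old. It is not what either poster actually runs. The directory layout (an active `log` file sitting next to `log-*` files), the function names `pid_is_alive` and `check_replication`, the thresholds, and the way the sync_client PID is obtained are all assumptions made for illustration.

```python
import os
import time
from pathlib import Path


def pid_is_alive(pid: int) -> bool:
    """Return True if a process with this PID exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 only tests for existence on POSIX
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def check_replication(sync_dir, client_pid, max_log_bytes, max_age_seconds, now=None):
    """Return a list of problems found; an empty list means all checks passed.

    sync_dir is a hypothetical directory holding the active "log" file and
    any leftover "log-PID" files. client_pid is the PID of the running
    sync_client, or None if it could not be determined.
    """
    now = time.time() if now is None else now
    sync_dir = Path(sync_dir)
    problems = []

    # 1. Is sync_client still running?
    if client_pid is None or not pid_is_alive(client_pid):
        problems.append("sync_client is not running")

    # 2. Has the active log file grown too big?
    log = sync_dir / "log"
    if log.exists():
        size = log.stat().st_size
        if size > max_log_bytes:
            problems.append(f"{log.name} is {size} bytes")

    # 3. Are there log-PID files that are too old?
    for path in sorted(sync_dir.glob("log-*")):
        age = now - path.stat().st_mtime
        if age > max_age_seconds:
            problems.append(f"{path.name} is {int(age)} seconds old")

    return problems
```

A cron job could call `check_replication(...)` every few minutes and page someone whenever the returned list is non-empty. Restarting sync_client, or rerunning it on a stale log file, is deliberately left out, since how to do that safely depends on the local setup.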