On Thu, Aug 09, 2007 at 02:08:11PM +0300, Janne Peltonen wrote:
> Hi!
>
> It appears that is a cyrus system is forcibly shut down, there is a
> replication log left (if the replica system wasn't up at the time). Now,
> is it safe to delete the log? What about the transactions that are in
> the log, is there a way to replay them later? What if the system has
> been up and running for a while after the crash / forced shutdown? Is
> there a way to extract the mailboxes that have entries in the old
> logfile, to call sync_client by hand to make sure that all the mailboxes
> are up to date? Or would that be needed?
>
> Whee.

Well, we run the attached perl script every 10 minutes on every machine
with a Cyrus instance on it. It has hooks into our infrastructure all
over the place, but that's mainly because we run up to about 16 instances
of Cyrus (both masters and replicas) on a single host, so we need lots of
extra logic to figure out (a) what's supposed to be running, and (b) which
processes and log files belong to each instance!

Anyway, the exciting bit is probably this:

    if (opendir(my $DH, "$ConfDir/sync")) {
      while (my $item = readdir($DH)) {
        # only look at per-process logs, which are named log-<pid>
        next unless $item =~ m/^log-(\d+)$/;
        my $pid = $1;

        # check if pid exists; if so, the log is still in use
        if (kill(0, $pid)) {
          next;
        }

        # the owning process is gone, so replay the orphaned log
        my $res = $Slot->RunCommand('sync_client', '-o', '-r', '-f' => "$ConfDir/sync/$item");

        # failure: non-zero exit status or any non-blank output
        if ($? or $res =~ m/\S/) {
          # figure out what you want to do here...
        }

        # success :)
        else {
          unlink("$ConfDir/sync/$item");
        }
      }
    }

NOTE: you can probably implement RunCommand directly in terms of system().
Ours is a bit complex because it puts "sudo -u cyrus /usr/cyrus/bin/" in
front of the command and "-C /etc/imapd-$SlotName.conf" after it, then
passes it through to the ME::Machine version of RunCommand. That version
does a fork, transparently (as much as possible) sshes to the correct
machine if needed, does optional per-line handling of responses with a
callback function, etc. It's a very powerful and easy interface, but very
integrated into our systems.
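
For anyone without that infrastructure, here is a minimal sketch of a stand-in. It is not the RunCommand described above: run_command is a hypothetical name, and it uses a forking pipe open rather than system() so that the output can be captured and tested with m/\S/ as in the loop. It assumes the caller passes the complete argument list, including any sudo prefix and -C option.

```perl
use strict;
use warnings;

# Hypothetical stand-in for $Slot->RunCommand (not the original code).
# Runs @cmd without a shell and returns its combined stdout and stderr.
# On return, $? holds the child's wait status, as the loop expects.
sub run_command {
    my @cmd = @_;
    die "no command given\n" unless @cmd;

    # open with '-|' forks; the parent reads the child's STDOUT from $fh.
    my $pid = open(my $fh, '-|');
    die "fork failed: $!\n" unless defined $pid;

    if ($pid == 0) {
        # Child: send STDERR down the same pipe, then exec directly so
        # that log file paths are never interpreted by a shell.
        open(STDERR, '>&', \*STDOUT) or die "dup failed: $!\n";
        exec { $cmd[0] } @cmd
            or do { print STDERR "exec $cmd[0] failed: $!\n"; exit 127 };
    }

    # Parent: slurp everything the child wrote.
    my $output = do { local $/; <$fh> };
    close($fh);    # waits for the child and sets $?
    return defined $output ? $output : '';
}
```

With that in place, the call in the loop could become something like run_command('sudo', '-u', 'cyrus', '/usr/cyrus/bin/sync_client', '-C', "/etc/imapd-$SlotName.conf", '-o', '-r', '-f', "$ConfDir/sync/$item"), followed by the same check of $? and the output.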

We also run "checkreplication", which actually makes a pair of imap
connections and enumerates through the mailboxes comparing stuff, and
another task which does a "du -s" on the sync directory every 2 minutes
and logs it to a database. That allows our status tools to inform us of
any replications which are falling behind (as well as sending emailed
notifications).

Our failover script also checks that DB value for both freshness and
lowness before it tries to fail over, and after shutting down the Cyrus
master it attempts to run all remaining logs and bails out if it can't.

Bron.
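
The backlog measurement and the freshness-and-lowness check described above could be sketched roughly as follows. This is an illustration only, not the actual monitoring or failover code: both function names, the shape of the $sample hash, and the threshold parameters are assumptions, and summing file sizes only approximates "du -s", which counts disk blocks.

```perl
use strict;
use warnings;

# Hypothetical helper: total size in bytes of the regular files directly
# inside $dir. Returns undef if the directory cannot be read.
sub sync_backlog_bytes {
    my ($dir) = @_;
    opendir(my $dh, $dir) or return undef;
    my $total = 0;
    while (defined(my $item = readdir($dh))) {
        my $path = "$dir/$item";
        next unless -f $path;        # skip ".", ".." and subdirectories
        $total += (-s _) || 0;       # -s is false for empty files
    }
    closedir($dh);
    return $total;
}

# Hypothetical check: $sample is { time => epoch seconds, bytes => size },
# i.e. the most recent measurement as it might be read from a database.
# Failing over is treated as safe only if the sample is recent
# (freshness) and the backlog is small (lowness).
sub safe_to_fail_over {
    my ($sample, $now, $max_age, $max_bytes) = @_;
    return 0 unless $sample && defined $sample->{bytes};
    return 0 if $now - $sample->{time} > $max_age;    # measurement is stale
    return 0 if $sample->{bytes} > $max_bytes;        # replica too far behind
    return 1;
}
```

A monitoring job might call sync_backlog_bytes("$ConfDir/sync") every couple of minutes and store the result with a timestamp; the failover script would then pass the stored sample to safe_to_fail_over with whatever age and size limits suit the site.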