Re: cyrus replication over a WAN

On Thu, Dec 10, 2009 at 12:37:51AM -0800, Jon . wrote:
> In the past I found that if Cyrus is restarted on the replica, the
> sync_client on the master server fails. I also saw instances of sync_client
> failing on the master if the replica isn't available. The lower levels
> pertaining to why I don't know (or remember), but I can recreate this and
> post the exact error that is logged if any one on this list is interested.

Yes, it sure does.  That's what the '-o' (try to connect only once) flag
is for: with it, the master doesn't fail to start if the replica is away!
 
> Here's a bash script executed by the cron daemon on a pair of Cyrus servers
> every minute for watch-dogging sync_client.

[snip]

Interesting, but pretty incomplete.  It doesn't deal with failed log files,
so replication won't be 100%.

> The script checks for the conditions that the server is set as the master
> (which it is set to if Cyrus is started) and that the sync_client is not
> running. If so it runs the sync_client again.

I'll attach our script.  It has to deal with the fact that there are multiple
sync_client and sync_server processes on every machine because we run multiple
instances of Cyrus all over the place, so it matches on config file names
(we pass them to every process as a command line option, and they appear in
the ps output).  We run it from cron every 10 minutes on every Cyrus machine.
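
To make the "match on config file names" idea concrete without our in-house modules, here is a minimal sketch. The function name rolling_sync_pids, the config paths and the sample ps lines are all made up for illustration; only the sync_client flags (-C, -r, -o, -f) are real.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the PIDs of rolling sync_client processes whose command line
# mentions the given config file. The name and sample data are hypothetical.
sub rolling_sync_pids {
    my ($config, @ps_lines) = @_;
    my @pids;
    for my $line (@ps_lines) {
        next unless $line =~ /sync_client/;
        next unless $line =~ /\s-r\b/;                 # rolling mode only
        next if     $line =~ /\s-f\b/;                 # skip one-off log replays
        next unless $line =~ /\Q$config\E(?:\s|$)/;    # exact config file name
        push @pids, $1 if $line =~ /^\s*(\d+)/;        # leading PID column
    }
    return @pids;
}

# Sample lines shaped like "ps axww" output (invented for this sketch).
my @sample = (
    " 1201 ?  S  0:03 sync_client -C /etc/imapd-slot1.conf -r -o",
    " 1350 ?  S  0:00 sync_client -C /etc/imapd-slot1.conf -r -f /var/imap/sync/log-99",
    " 1402 ?  S  0:01 sync_client -C /etc/imapd-slot2.conf -r -o",
);
my @found = rolling_sync_pids('/etc/imapd-slot1.conf', @sample);
print "slot1 rolling sync_client pids: @found\n";
```

In real use the lines would come from a pipe to ps rather than a fixed list; keeping the matching in a function that takes plain strings makes it easy to check against captured output first.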

(coming back to edit: I've just read through the code again, and I'm not
embarrassed to post it ;)  It does depend very heavily on our Perl
infrastructure code - so it's not very portable to anywhere else!  But
the ideas are sound and pretty easy to understand if you know a bit of
Perl!)
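
Since the attached script leans on non-portable modules, here is a rough standalone sketch of its central idea: find log-<pid> files whose sync_client has gone away, so they can be replayed. It assumes files are named log-<pid> in a single directory. The function name orphaned_logs is hypothetical, and the liveness test uses kill 0 rather than reading /proc/<pid>/cmdline as the script does, so it does not guard against a PID having been reused by an unrelated process.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Errno qw(EPERM);
use File::Temp qw(tempdir);

# List "log-<pid>" files in $dir whose owning process no longer exists.
# kill 0 sends no signal; it only reports whether the PID can be signalled.
# EPERM means the process exists but belongs to another user.
sub orphaned_logs {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "cannot open $dir: $!";
    my @orphans;
    for my $name (sort readdir $dh) {
        next unless $name =~ /^log-(\d+)$/;
        my $pid = $1;
        next if kill(0, $pid) || $! == EPERM;   # owner is still alive
        push @orphans, $name;
    }
    closedir $dh;
    return @orphans;
}

# Demo in a scratch directory: one log owned by this (live) process, one
# owned by a PID assumed not to exist, and the active "log" file.
my $dir = tempdir(CLEANUP => 1);
for my $name ("log-$$", "log-99999999", "log") {
    open(my $fh, '>', "$dir/$name") or die "cannot create $name: $!";
    close $fh;
}
my @orphans = orphaned_logs($dir);
print "would replay: @orphans\n";
```

Each file found this way would then be handed to sync_client with -r -f, and deleted only if that run succeeds, which is what the attached script does.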
 
> You can also use this script to enable bi-directional IMAP replication. Why
> you would want to do such a thing (with a two-server master/replica
> configuration) is if you have SMTP daemons running on both of your Cyrus
> servers and want the "replica"/"passive" (IMAP-side at least) server to be
> able to push mail accepted by it from the outside world onto the master
> Cyrus server. This is probably better (and faster for the user) than simply
> dropping packets from outside SMTP servers (which is what I've seen on
> two-server master/replica Cyrus servers out there, fencing was done through
> setting ports as "filtered"). Because a user's IMAP activity on the master
> server shouldn't conflict with incoming mail, in theory bi-directional Cyrus
> replication for incoming messages from the SMTP daemon is not only optimal
> but also possible without creating a split-brain scenario. I plan to test
> this myself when I have time.

Ouch.  User uploads (or copies) a message into their Inbox at the same time
as a message is delivered at the other end.  Cyrus happily overwrites the
copy on the "replica" end (whichever copy of sync_client happens to run first).

I've got some code to make that better - half written in fact.  I already have
code to bail and leave the situation unresolved, which is better than
overwriting, but it still nukes "future" messages, which is just as bad.
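
The collision is easy to see in a toy model. The sketch below is not Cyrus code and does not reflect its on-disk structures; it only assumes that each end hands out the next UID for a mailbox from its own counter, and every name in it is made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Toy model: each end keeps its own "next UID" counter for the same mailbox.
sub new_mailbox {
    my ($next_uid) = @_;
    return +{ next_uid => $next_uid, msgs => {} };
}

# Append a message locally and return the UID this end assigned to it.
sub append_msg {
    my ($mbox, $body) = @_;
    my $uid = $mbox->{next_uid}++;
    $mbox->{msgs}{$uid} = $body;
    return $uid;
}

# UIDs that exist on both ends but refer to different messages.
sub conflicting_uids {
    my ($left, $right) = @_;
    return grep {
        exists $right->{msgs}{$_} && $right->{msgs}{$_} ne $left->{msgs}{$_}
    } sort { $a <=> $b } keys %{ $left->{msgs} };
}

my $master  = new_mailbox(100);
my $replica = new_mailbox(100);    # both ends in sync so far

append_msg($master,  'user upload via IMAP');   # happens on the master
append_msg($replica, 'SMTP delivery');          # happens on the "replica"

my @clash = conflicting_uids($master, $replica);
print "UIDs claimed by two different messages: @clash\n";
```

Whichever direction syncs first, one of the two messages holding that UID has to lose, which is the overwrite described above.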
 
> This was all done with Cyrus 2.3.7 on CentOS 5.3 and 5.4 in each case. Your
> mileage may vary and if any of the behavior described above is expected to
> be different with new versions of Cyrus please share.

No, not really.  2.3.7 is OK, though it still has some modseq-related issues
and a few skiplist bugs.

I really, really wouldn't do what you're talking about doing though.  Instead,
make the MTA on your replica go find the master end and LMTP to it.  The
network traffic will be twice as much (since the message will replicate back
again), but the locking and UID creation will be guaranteed to work.

Regards,

Bron.
#!/usr/bin/perl -w

BEGIN { do "/home/mod_perl/hm/ME/FindLibs.pm"; }

use strict;
use warnings;

use ME::FMVars;
use ME::ImapStore;
use ME::ImapSlot;
use ME::Machine;
use MailApp::Admin::Actions;
use IO::LockedFile;
use IO::File;
use Getopt::Std;

my %Opts;
getopts('vn', \%Opts);

my $Name = ME::Machine->Name();

my $ThisHost = ME::FMVars::GetThisHost();
my $Imap = $ThisHost->{imap} || {};
foreach my $SlotName (sort keys %$Imap) {
  my $Slot = ME::ImapSlot->new($SlotName);
  next unless $Slot->Store->ReplicaHostName(); # not replicated

  my $ConfDir = $Slot->CyrusConfigPath();
  my $Lock = IO::LockedFile->new({block => 0}, ">$ConfDir/sync/monitorsync.lock");
  unless ($Lock) {
    print "$SlotName: already locked, skipping\n" if $Opts{v};
    next;
  }

  next unless $Slot->IsMaster(); # we don't audit sync on replicas

  print "Doing slot $SlotName\n" if $Opts{v};

  if (-f "$ConfDir/sync/shutdown") {
    print "$SlotName: shutdown file exists, skipping\n";
    next;
  }
  unless ($Slot->Store->ReplicaSlot->IsRunning()) {
    print "$SlotName: ignoring, replica is down\n" if $Opts{v};
    next;
  }

  my @pids = get_pids($SlotName);

  my @ran;
  if (opendir(my $DH, "$ConfDir/sync")) {
    while (my $item = readdir($DH)) {
      if ($item eq 'log' and not @pids) {
        $item = "slog-$$";
        print "renaming log to $item\n";
        rename("$ConfDir/sync/log", "$ConfDir/sync/$item") or next; # nothing to replay if the rename failed
      }
      next unless $item =~ m/^(?:s\d*)?log-(\d+)$/;
      my $pid = $1 || '';

      # check if pid exists
      if ($pid and my $fh = IO::File->new("</proc/$pid/cmdline")) {
        local $/;
        my $cmdline = <$fh>;
        $fh->close();
        if ($cmdline =~ m/sync_client.*\Q$SlotName\E/s) {
          print "$SlotName: Skipping log file $item, process is running\n" if $Opts{v};
          next;
        }
      }

      #if ((my $size = -s "$ConfDir/sync/$item" || 0) > 30000) {
        #print "Splitting $item ($size) into 20kb chunks\n";
        #system("sudo -u cyrus split -d -a 4 -C 20000 $ConfDir/sync/$item $ConfDir/sync/s${pid}log-");
        #unlink("$ConfDir/sync/$item");
        #next;
      #}

      print "$SlotName: Syncing file $item\n" if $Opts{v};

      # returns an empty string on success
      my $res = $Slot->RunCommand('sync_client', '-o', '-r', '-f' => "$ConfDir/sync/$item");

      # failure
      if ($? or $res =~ m/\S/) {
        print "$SlotName: Failed $item, notifying ($res, $?)\n" if $Opts{v};
        eval { MailApp::Admin::Actions::NotifyAdmins('email', "$Name/$SlotName sync_client failed on $item", "$res (error: $?)"); };
      }

      # success :)
      else {
        print "$SlotName: Done $item, deleting\n" if $Opts{v};
        push @ran, $item;
        unlink("$ConfDir/sync/$item");
      }
    }
    closedir($DH);
  }
  if (@ran) {
    my $num = @ran;
    eval { MailApp::Admin::Actions::NotifyAdmins('email', "$Name/$SlotName sync_client ran $num leftover logs", join("\n", @ran)); } if $Opts{n};
  }
  unless (@pids) {
    # start a new one
    print "$SlotName: Starting a new sync_client, old one gone away\n" if $Opts{v};
    eval { MailApp::Admin::Actions::NotifyAdmins('email', "$Name/$SlotName sync_client missing, starting a new one"); } if $Opts{n};
    $Slot->RunCommand({Daemon => 1}, 'sync_client', '-r', '-o', '-v');
  }
}

print "Finished\n" if $Opts{v};

sub get_pids {
  my $SlotName = shift;
  my @res;

  if (open(my $FH, "ps axww |")) {
    while (<$FH>) {
      next unless m/sync_client/;
      next if m/ -f/; # specific file, we don't want that
      next unless m/ -r/; # needs to be rolling
      next unless m/imapd-\Q$SlotName\E/; # config file name identifies the instance
      next unless m/^\s*(\d+)/;
      my $Pid = $1;
      push @res, $Pid;
    }
    close($FH);
  }

  return @res;
}
----
Cyrus Home Page: http://cyrusimap.web.cmu.edu/
Cyrus Wiki/FAQ: http://cyrusimap.web.cmu.edu/twiki
List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html
