Re: Cyrus Replication (example) [was Re: restore from cyrdump]

On 12/19/2014 06:17 AM, Patrick Goetz wrote:
Nic,

Thanks for that detailed explanation.  I still feel myself somewhat 
stymied by either the documentation (or lack thereof) or perhaps an 
unfortunate case of being somewhat feeble-minded.  Here are some follow 
up comments/questions:


On 12/18/2014 9:59 AM, Nic Bernstein wrote:
I will say that the ability to quiesce the application without halting
it would be most desirable.  Most databases have supported this sort of
thing for ages, and it would be great if one could send a signal to
Cyrus to achieve the same result.
I wonder what would happen if you just stopped lmtp while making a 
snapshot?  Would postfix choke on this and start kicking messages back 
to the sender, or would they get queued for later delivery? 
Alternatively, maybe lmtp could temporarily divert new messages to a 
dummy spool so that postfix/sendmail wouldn't have to know anything 
about this.  This might be the least painful way to implement quiescence 
in cyrus.

But LMTP is only one method affecting the mail store; IMAP and sieve can change it as well.  Granted, one can brute-force this by shutting down network ports and the like, but at that point why not just stop cyrus?

 > His initial suggestion -- stop cyrus, snapshot, restart cyrus -- is
 > reasonable, but we feel that the later suggestion -- stop cyrus, tar
 > up data, start cyrus -- is not.  It takes data offline for too long.
 > That's why the snapshot capability is necessary in any truly suitable
 > server.

I agree.  Here is a substitute proposal (and I'll come back to why I'm 
pushing this point).  Serially

   1. rsync user mail files
   2. rsync configdirectory db files
   3. rsync user mail files again

That should get you reasonably close to what you get with snapshots.

No, not in the least is this close to a snapshot.  Snapshots are instantaneous, or near to it.  The time an rsync takes, even a catch-up, grows with the size of the mail store and the deltas between attempts.  Also, rsync is not well suited to the file-per-message, directory-per-mailbox storage scheme of cyrus: it has to stat() every message file on every run, and this just adds to the time.

I don't understand why one wouldn't use snapshots.  Every modern OS and distro includes filesystems or volume managers that support snapshotting, and several, such as Ubuntu, even recommend snapshot-capable partitioning schemes out of the box.  It's just not that hard, and it's exactly the right way to handle this sort of staged backup.
  • Halt cyrus
  • snapshot critical filesystems
    • spool data (/var/spool/imap)
    • config data (/var/lib/imap or /var/imap)
    • metadata (i.e. /var/run/cyrus)
  • start cyrus
  • mount snapshot
  • rsync or otherwise backup from snapshot
  • unmount snapshot
  • (optionally) destroy snapshot
This is so easy to handle via a cron or at job.  Why wouldn't one do this?  If the answer is "legacy system," then fine, but legacies can be upgraded or replaced.
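
The steps above could be scripted roughly as follows.  This is only a sketch under several assumptions that are not in the message: LVM is the snapshot mechanism, the volume group is called vg0 with logical volumes named spool and config, cyrus is controlled through /etc/init.d/cyrus-imapd, and backups land under /mnt/backup.  By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch of the halt / snapshot / restart / back up cycle using LVM.
# Hypothetical names: volume group vg0, logical volumes "spool" and
# "config", init script /etc/init.d/cyrus-imapd, target /mnt/backup.

DRY_RUN="${DRY_RUN:-1}"            # 1 = print commands only, 0 = execute
VG="${VG:-vg0}"
LVS="${LVS:-spool config}"         # every volume holding cyrus data
DEST="${DEST:-/mnt/backup}"

run() {
    # Print the command; execute it only when DRY_RUN=0.
    echo "+ $*"
    [ "$DRY_RUN" = "1" ] || "$@"
}

snapshot_backup() {
    rc=0
    run /etc/init.d/cyrus-imapd stop || return 1
    # Cyrus is down only while the snapshots are being created.
    for lv in $LVS; do
        run lvcreate --snapshot --size 5G --name "${lv}-snap" "/dev/$VG/$lv" || rc=1
    done
    run /etc/init.d/cyrus-imapd start || return 1
    # If a snapshot failed, stop here; remove partial snapshots by hand.
    [ "$rc" -eq 0 ] || return 1

    # Back up from the read-only snapshots while cyrus is serving again.
    for lv in $LVS; do
        mnt="/mnt/${lv}-snap"
        run mkdir -p "$mnt" &&
            run mount -o ro "/dev/$VG/${lv}-snap" "$mnt" &&
            run rsync -a --delete "$mnt/" "$DEST/$lv/" || rc=1
        run umount "$mnt"
        run lvremove -f "/dev/$VG/${lv}-snap"
    done
    return "$rc"
}
```

Append a call to snapshot_backup and run the script as root with DRY_RUN=0 once the printed commands look right.  The snapshot size (5G here) only has to hold the changes written while the backup runs.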

If you follow the prescribed cyrus directory structure, then this can be 
simplified (Arch linux example):

   1. rsync -a --delete /var/imap/user [removable disk/other server]
   2. rsync -a --delete /var/imap   [removable disk/other server]

Once you've rsynced the mail files once, rsyncing them again a short 
time later should be pretty fast.  There does need to be a backup 
solution for people who only have one server, hence can't use 
replication or imapsync to do backups.

There is: snapshots, or hosted mail services (like Fastmail :).

Lastly, as to the use of imapsync to achieve user, mailbox or server
replication,...

So your command line is much like Patrick's example, but with '--user1
<user> --authuser1 <proxyuser> --user2 <user>...'
Of course you must create a proxy user, and Cyrus supports this with the
'proxyservers' option in imapd.conf (man imapd.conf for details),
i.e.: 'proxyservers:    proxyuser'.
Here is the imapd.conf man page entry for proxyservers:

   proxyservers: <none>
     A list of users and groups that are allowed to proxy for other
     users, separated by spaces. Any user listed in this will be
     allowed to login for any other user: use with caution. In a
     standard murder this option should ONLY be set on backends.
     DO NOT SET on frontends or things won't work properly.

That capitalized "DO NOT SET on frontends" would seem to be cause for 
concern, especially since I don't understand how this works.

Well then, get thee to a website or man page. :-) 
    http://cyrusimap.web.cmu.edu/docs/cyrus-imapd/2.4.17/ag.php

No, seriously, this isn't an issue if you're not using a murder.  A "frontend" is the part of a murder aggregation cluster which proxies for the backend servers which actually hold the mail store.  A murder consists of one or more frontends, one or more backends and a single "mupdate" master, which controls the canonical copy of the mailboxes database.  In a murder, if one wants to set the proxyservers option, one sets it only on the backend machines.

The proxyservers option is exactly the right way to do this.
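
For illustration, a proxy-authenticated imapsync run along the lines described above might look like the sketch below.  The host names, the proxy account name "proxyuser" and the password file locations are hypothetical; the account must be listed in proxyservers on both servers.  By default the function only prints the command.

```shell
#!/bin/sh
# Sketch: copy one user's mailboxes with imapsync while logging in as
# a proxy user on both sides.  Hosts, proxy account and password file
# paths are placeholders, not values from the thread.

DRY_RUN="${DRY_RUN:-1}"            # 1 = print commands only, 0 = execute

run() {
    echo "+ $*"
    [ "$DRY_RUN" = "1" ] || "$@"
}

proxy_imapsync() {
    user="$1"
    # --authuserN is the account that authenticates; --userN is the
    # mailbox owner whose mail is copied.  The password files hold the
    # proxy account's password, not the user's.
    run imapsync \
        --host1 old.example.com --user1 "$user" \
        --authuser1 proxyuser --passfile1 /etc/imapsync/proxy1.pass \
        --host2 new.example.com --user2 "$user" \
        --authuser2 proxyuser --passfile2 /etc/imapsync/proxy2.pass
}
```

A loop such as `while read -r u; do proxy_imapsync "$u"; done < users.txt` would then cover every account without anyone's personal password being needed.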

For people who are
  1. imapsync'ing between machines both behind a firewall
  2. using saslauthd with pam

I thought of this solution:  Temporarily block port 143 traffic on the 
outward facing port of your firewall, and then add the line

   auth  sufficient  pam_permit.so

to the top of /etc/pam.d/imap files on both the sending and receiving 
imap servers.  This should allow you to imapsync the mail stores for 
every user without having to provide passwords.  Once you're done, 
simply remove these lines from the PAM configuration files and unblock 
the port on the firewall.  Yes, this will mean that users won't be able 
to access their mail from outside the firewall while the imapsync is in 
operation, and this is probably only workable for smaller organizations 
where people are not concerned about their coworkers temporarily being 
able to access their mail.  There could probably be a desktop policy to 
handle this as well.

Ouch, that seems a lot harder to me than setting proxyservers.

However, you are 100% correct that replication would appear to be a far 
less complex solution.  After reading through the available 
documentation, it wasn't clear to me that it was possible to do 
replication without setting up a murder, a complexity I was hoping to avoid.

So, here's the feeble-mindedness component:  I didn't completely follow 
your explanation for setting up a replication server.  It would be 
awesome to have a howto for doing this.  Is anyone aware of anything 
like this, i.e. how to set up a replication server outside the murder 
context?

Then please take a look at the replication page on the Project Cyrus website:
    http://cyrusimap.org/docs/cyrus-imapd/2.4.17/install-replication.php

Here's my earlier example with the murder components stripped out, and some commenting added:

Both servers (note last entry):
/etc/services
lmtp		24/tcp
imap2		143/tcp
imap2		143/udp
imaps		993/tcp
imaps		993/udp
sieve		4190/tcp
csync		2005/tcp

Master server:
/etc/imapd.conf
...
##
# These configuration parameters are for the master server
# in a replication set

# The list of userids with administrative rights
admins: cyrus

##
# Replication support
# This is how the BACKEND for this host is defined
sync_host: replica.example.com
sync_authname: mailproxy
sync_password: <password>
sync_realm: <if required for your auth scheme>

# Whether to compress the replication stream, important if using WAN links
sync_compress: true

# To enable "rolling" replication, set this to TRUE
# This causes all data altering daemons, such as imapd, lmtpd, etc. to log their
# actions for replication.
sync_log: true

# Minimum interval (in seconds) between replication runs in rolling replication mode.
sync_repeat_interval: 5

# A file whose existence will cause the sync_client to stop at its next opportunity
sync_shutdown_file: /var/run/cyrus/sync_stop
...
/etc/cyrus.conf
...
START {
	...
	syncclient		cmd="/usr/lib/cyrus/bin/sync_client -r"
	...

Replica server:
/etc/imapd.conf
...
##
# These configuration parameters are for the replica server in a
# replication cluster

# The list of userids with administrative rights
# For a replica, this must include the user with which the master
# will authenticate
admins: cyrus mailproxy

## 
# Unless you're using TLS between master and replica, add this
force_sasl_client_mech: PLAIN
master_mechs: PLAIN
/etc/cyrus.conf
...
SERVICES {
	...
	syncserver       cmd="/usr/lib/cyrus/bin/sync_server" listen="csync"
	...
Here are some extra notes:
  • The webpage listed above on replication explains rolling replication (think "log shipping" from the DB world) as well as manual replication.  Check that out.
  • We find that it doesn't hurt to use both rolling and periodic replication, and have cron handle the latter.
  • If the master stops listening for csync traffic, when halted for a snapshot, for example, then the sync_server process on the replica will die.  So, we use a nanny cronjob to make sure that one gets started if none are running.
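
A periodic catch-up run of the kind mentioned in these notes could be sketched as follows.  This is not the cyrus_user_sync.pl script referenced in the crontab (which is not shown in this message); it simply calls sync_client once per user, using the binary path and replica host name from the configuration above.  Where the list of user ids comes from is left open.

```shell
#!/bin/sh
# Sketch of a periodic (non-rolling) replication pass: run
# "sync_client -u <user>" for every user id read from stdin.
# Run it as the cyrus user on the master.

DRY_RUN="${DRY_RUN:-1}"            # 1 = print commands only, 0 = execute
SYNC_CLIENT="${SYNC_CLIENT:-/usr/lib/cyrus/bin/sync_client}"
REPLICA="${REPLICA:-replica.example.com}"

run() {
    echo "+ $*"
    # </dev/null keeps the command from eating the user list on stdin.
    [ "$DRY_RUN" = "1" ] || "$@" </dev/null
}

sync_users() {
    rc=0
    while read -r user; do
        [ -n "$user" ] || continue          # skip blank lines
        run "$SYNC_CLIENT" -S "$REPLICA" -u "$user" || rc=1
    done
    return "$rc"
}
```

For example, `sync_users < /etc/cyrus-users.txt` (a hypothetical file with one user id per line) replicates each listed account and returns non-zero if any of them failed.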
Here are our crontabs for master and replica:

Master:
### Ensure replication is up to date
30 5 * * * /usr/local/sbin/cyrus_user_sync.pl >/dev/null 2>&1
##
### Run quota check script
30 6 * * * /usr/local/sbin/quota-report >/dev/null 2>&1
##
### Update mailbox annotations
45 6 * * * /usr/local/sbin/set_cyrus_annotations.sh >/dev/null 2>&1
##
### Update quotas
*/5 * * * * /usr/local/sbin/cyrus_ldap_quota.pl >/dev/null 2>&1
Replica:
##
# ensure that the sync_client keeps running.  Comment this out
# following promotion from replica to master.
@hourly	/usr/local/sbin/sync_nanny.sh >/dev/null
We'll be happy to share these scripts with anyone who'd care to have a copy, but they might be specific to our use of LDAP to manage account details.  The idea of each, however, is to leverage the account DB, which in our case is almost always LDAP, to maintain, update or alter the cyrus account information.
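
To give an idea of what a nanny job involves, here is a minimal sketch.  It is not the sync_nanny.sh named in the crontab above, which is not shown in this message; it assumes pgrep is available and that starting sync_client -r by hand is acceptable on your system.

```shell
#!/bin/sh
# Sketch of a nanny: start a command if no process with the given
# name is running.  Hypothetical stand-in for sync_nanny.sh.

sync_nanny() {
    proc="$1"; shift
    # pgrep -x matches the exact process name; exit status 0 means at
    # least one instance is alive, so there is nothing to do.
    if pgrep -x "$proc" >/dev/null 2>&1; then
        return 0
    fi
    echo "restarting $proc"
    # Assumption: the command detaches by itself.  If your build keeps
    # it in the foreground, start it with nohup and & instead.
    "$@"
}
```

From cron, as the cyrus user, this would be called as `sync_nanny sync_client /usr/lib/cyrus/bin/sync_client -r`.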

However, I must be honest and point out that if you're going to go to
the trouble of figuring out how to use imapsync (and possibly pay for
it, to boot) you may as well just set up a replica.  As I've shown,
above, it's just not that hard.
Imapsync is still useful for migrating individual users from one imap 
server to another.  In my case, I'm migrating from a cyrus 2.3.x server 
using Berkeley db metadata files to a cyrus 2.4.x server which will be 
entirely skiplist based.  Understood that you can convert db files to 
skiplists, but I feel most comfortable using imapsync for this.  In this 
use case there are only a handful of users, but they all have extremely 
complex and massive mail folders.

My current plan is to use imapsync for the migration and then 
replication to another dummy server for backup, assuming I can figure 
out how to set up replication.

I strongly recommend against this course of action.  If you're migrating between two boxes, which it sounds like you are, then you're much better off rsyncing the spool data between them (once you've halted cyrus) and then allowing cyrus to perform the necessary DB updates. 

Check the Install-Upgrades page for anything else which changes between your versions of cyrus.  Since you didn't specify which 2.3.x or 2.4.x you're using, I can't tell you what you'll need, but you'll find that info in doc/install-upgrade.html of your version.  If you're installing from packages this may not be included, so do yourself a favor and download a copy for reference.

As the upgrade guide states (emphasis added):
The default type for all databases is now skiplist which is very reliable now, all the bugs are ironed out! Because ctl_cyrusdb -r automatically converts databases between known types, you shouldn't need to do anything, but if you want to keep the old defaults, you'll need to make them explicit in your imapd.conf as follows:
duplicate_db: berkeley-nosync
ptscache_db: berkeley
statuscache_db: berkeley-nosync
tlscache_db: berkeley-nosync
You have said you want skiplist, so you needn't add those settings, just make sure you remove any that exist if you copy your old imapd.conf file over.

If you prefer to manually convert the DB files, you can do this with the supplied cvt_cyrusdb tool:
$ /usr/lib/cyrus/bin/cvt_cyrusdb /tmp/annotations.db berkeley /var/lib/imap/annotations.db skiplist

or for Ubuntu
$ cyrus cvt_cyrusdb /tmp/annotations.db berkeley /var/lib/imap/annotations.db skiplist
Note that in this case, you should NOT rsync the DB files into the new server's /var/lib/imap (or whatever your config directory is) but rather into a holding area, like /tmp, from which you can read them for the DB conversion.

Also, make sure you do all of this as the cyrus user, or you'll end up with permissions problems.
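
Put together, the manual conversion could be wrapped like this.  It is a sketch only: the holding area /tmp, the config directory /var/lib/imap and the cvt_cyrusdb path are taken from the example above, while the function name and the choice of which DB files to convert are left to you.  By default it only prints the command.

```shell
#!/bin/sh
# Sketch: convert a Berkeley DB file from a holding area into a
# skiplist file in the new server's config directory.
# cvt_cyrusdb needs full path names for both files.

DRY_RUN="${DRY_RUN:-1}"            # 1 = print commands only, 0 = execute
CVT="${CVT:-/usr/lib/cyrus/bin/cvt_cyrusdb}"
HOLDING="${HOLDING:-/tmp}"
CONFIGDIR="${CONFIGDIR:-/var/lib/imap}"
CYRUS_USER="${CYRUS_USER:-cyrus}"

run() {
    echo "+ $*"
    [ "$DRY_RUN" = "1" ] || "$@"
}

convert_db() {
    name="$1"
    # Refuse a real run as the wrong user to avoid permissions problems.
    if [ "$DRY_RUN" != "1" ] && [ "$(id -un)" != "$CYRUS_USER" ]; then
        echo "run this as $CYRUS_USER" >&2
        return 1
    fi
    run "$CVT" "$HOLDING/$name" berkeley "$CONFIGDIR/$name" skiplist
}
```

With cyrus stopped on the new server, `DRY_RUN=0 convert_db annotations.db` run as the cyrus user performs the same conversion as the command shown earlier.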

Good luck!
    -nic


Thanks again for your helpful comments!

----
Cyrus Home Page: http://www.cyrusimap.org/
List Archives/Info: http://lists.andrew.cmu.edu/pipermail/info-cyrus/
To Unsubscribe:
https://lists.andrew.cmu.edu/mailman/listinfo/info-cyrus

-- 
Nic Bernstein                             nic@xxxxxxxxxxx
Onlight, Inc.                             www.onlight.com
219 N. Milwaukee St., Suite 2a            v. 414.272.4477
Milwaukee, Wisconsin  53202