> I would like to add a *lot* more storage so that we can increase our email
> quotas (currently 200MB per user). It seems like the proper way to scale
> up is to split the Cyrus metadata off and use some large SATA drives for
> the message files. I was considering adding a shelf of 1TB SATA drives to
> our SAN. I could store the metadata on existing FC drives on the SAN, or
> just use internal disks on the servers.

We split our meta and data: 10k/15k RPM RAID1 for meta, 7.2k RPM RAID5 for data. The meta is 1/20th the size of the data. The meta drives get more data written to them; the data drives get more data read (we have lots of memory now, so probably lots of meta is cached). On average, utilisation for meta is still higher than for data, but they seem relatively well balanced with that split.

> But then I started thinking about how I was going to backup all this new
> data... Our backup administrator isn't too excited about trying to backup
> 12TB of email data.

We back up to an X4500 server. Bron built our custom backup system for Cyrus. Each Cyrus machine has a backup daemon that speaks a simple network protocol. The daemon knows where the meta and data files are, and can read and understand cyrus.* files. A backup process on the X4500 runs each day, connects to the daemon on each Cyrus machine, uses it to find the changes for each folder of each user, and updates the backup on the X4500.

All backups are stored in .tgz streams with a copy of every email and every cyrus.* file. Metadata is stored in an sqlite file. In general the backup process just appends to the .tgz stream. When it calculates that the ratio of "old" data in the .tgz is too high, it re-packs the whole thing, removing the old data.

The whole thing relies a lot on internal knowledge of our setup, so unfortunately it's not something we can easily release.

> What if we used Cyrus replication in combination with delayed expunge as a
> form of "backup"?
> We currently only keep 1 month of daily backups
> anyways...

It's an option, but it's still a bit scary. What if there's a replication protocol error that blows away your replica? Unlikely, but possible.

I think we might be a bit paranoid. We don't like relying on any one thing: filesystems, software, hardware, etc. The net result is that we've ended up with quite a few levels of redundancy.

1. All data on RAID, so any HD failure is just a replacement HD, with no downtime at all
2. Delayed delete, so any user deletion error can be fixed by re-inserting the deleted messages
3. All data replicated, so any server/storage unit failure is just a matter of switching master/replica
4. Nightly backups to a completely separate server, with a different OS and filesystem, and with no shared credentials or trust. Basically a last resort in case of a major hardware/OS/security screw-up, one that you absolutely hope you never have to use.

Rob

----
Cyrus Home Page: http://cyrusimap.web.cmu.edu/
Cyrus Wiki/FAQ: http://cyrusimap.web.cmu.edu/twiki
List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html
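
For anyone wanting to experiment with the append-then-repack idea described above for the backup streams, here is a minimal toy sketch in Python. It is not the actual backup system (which isn't released); the class name, the method names, and the 0.5 threshold are all assumptions made up for illustration, and it only models sizes in memory rather than real .tgz files.

```python
from dataclasses import dataclass, field


@dataclass
class BackupStream:
    """Toy model of one user's append-only backup stream (hypothetical)."""

    # Assumed threshold; the real system's ratio is not stated in the post.
    max_stale_ratio: float = 0.5
    # Each record is [key, size_in_bytes, is_live], kept in append order.
    records: list = field(default_factory=list)

    def append(self, key, size):
        # A new copy supersedes any earlier live record with the same key,
        # but the old bytes stay in the stream until the next repack.
        self._mark_stale(key)
        self.records.append([key, size, True])

    def delete(self, key):
        # Deleted items are only marked as old; nothing is rewritten yet.
        self._mark_stale(key)

    def _mark_stale(self, key):
        for rec in self.records:
            if rec[0] == key and rec[2]:
                rec[2] = False

    def stale_ratio(self):
        total = sum(size for _, size, _ in self.records)
        stale = sum(size for _, size, live in self.records if not live)
        return stale / total if total else 0.0

    def needs_repack(self):
        return self.stale_ratio() > self.max_stale_ratio

    def repack(self):
        # Rewrite the stream keeping only live records, in original order.
        self.records = [rec for rec in self.records if rec[2]]


stream = BackupStream(max_stale_ratio=0.5)
stream.append("msg1", 100)
stream.append("msg2", 100)
stream.append("cyrus.index", 50)
stream.append("cyrus.index", 50)   # newer copy, older one becomes stale
stream.delete("msg1")
stream.delete("msg2")
if stream.needs_repack():
    stream.repack()
```

The point of the design is that the daily run stays cheap (append only), and the expensive full rewrite happens only when enough of the stream is dead weight.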