May I ask how you are doing the actual replication, technically speaking? Shared FS, DRBD, something over IMAP?

Rob Mueller wrote:
>> As fastmail.fm seems to be a very big setup of Cyrus nodes, I would be
>> interested to know how you organized load balancing and manage disk
>> space.
>>
>> Did you set up servers for a maximum of, let's say, 1000 mailboxes and
>> then use a new server? Or do you use a murder installation so you can
>> move mailboxes to another server once a certain server gets too much
>> load? Or do you have a big SAN storage with good mmap support behind
>> an arbitrary number of Cyrus nodes?
>
> We don't use a murder setup, for two main reasons:
> 1) Murder wasn't very mature when we started.
> 2) The main advantage murder gives you is a set of proxies
> (imap/pop/lmtp) to connect users to the appropriate backends, which we
> ended up using other software for, and a unified mailbox namespace if
> you want to do mailbox sharing, something we didn't really need either.
> Also, the unified namespace needs a global mailboxes.db somewhere. As it
> was, because the skiplist backend mmaps the entire mailboxes.db file
> into memory, and we had multiple machines with 100M+ mailboxes.db files,
> I didn't really like the idea of dealing with a 500M+ mailboxes.db file.
>
> We don't use shared SAN storage. When we started out we didn't have
> that much money, so purchasing an expensive SAN unit wasn't an option.
>
> What we have has evolved over time to our current point. Basically we
> now have a hardware set that is quite nicely balanced with regard to
> spool IO vs metadata IO vs CPU, and a storage configuration that gives
> us replication with good failure handling, but without having to waste
> lots of hardware on idle replica machines.
>
> IMAP/POP frontend - We used to use perdition, but have now changed to
> nginx (http://blog.fastmail.fm/?p=592). As you can read from the linked
> blog post, nginx is great.
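[Editor's note: as an aside, the per-user backend lookup such a frontend proxy performs might look roughly like this. This is a minimal sketch with hypothetical table contents and hostnames; nginx's mail module actually delegates this decision to an external auth_http service, so this stands in for that service's routing logic, not for anything FastMail has published.]

```python
# Sketch of the backend lookup a frontend IMAP/POP proxy performs.
# The user->store table and store->host mapping are hypothetical;
# in nginx's mail module this decision is made by the auth_http service.

USER_STORE = {            # which replicated "store" holds each user
    "alice@example.com": "store07",
    "bob@example.com":   "store12",
}

STORE_MASTER = {          # which backend currently holds the master slot
    "store07": ("imap3.internal", 143),
    "store12": ("imap5.internal", 143),
}

def route_user(username):
    """Return (host, port) of the backend master for this user,
    or None if the user is unknown."""
    store = USER_STORE.get(username)
    if store is None:
        return None
    return STORE_MASTER[store]
```

Keeping the user-to-store mapping separate from the store-to-master mapping means a failover only has to update the second, small table.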
>
> LMTP delivery - We use a custom-written perl daemon that forwards lmtp
> deliveries from postfix to the appropriate backend server. It also does
> the spam scanning, virus checking and a bunch of other in-house stuff.
>
> Servers - We use servers with attached SATA-to-SCSI RAID units with
> battery-backed caches. We have a mix of large drives for the email
> spool, and smaller, faster drives for metadata. That's the reason we
> sponsored the metapartition config options
> (http://cyrusimap.web.cmu.edu/imapd/changes.html).
>
> Replication - We initially started with pairs of machines, half of each
> being a replica and half a master, replicating to each other, but that
> meant on a failure one machine became fully loaded with masters, and
> masters take a much bigger IO hit than replicas. Instead we went with a
> system we call "slots" and "stores". Each machine is divided into a set
> of "slots". "Slots" from different machines are then paired as a
> replicated "store" with a master and a replica. So say you have 20
> slots per machine (half master, half replica) and 10 machines; then if
> one machine fails, on average you only have to distribute one more
> master slot to each of the other machines. Much better on IO. Some more
> details in this blog post on our replication trials:
> http://blog.fastmail.fm/?p=576
>
> Yep, this means we need quite a bit more software to manage the setup,
> but now that it's done, it's quite nice and works well. For maintenance,
> we can safely fail all masters off a server in a few minutes, about
> 10-30 seconds a store. Then we can take the machine down, do whatever we
> want, bring it back up, wait for replication to catch up again, then
> fail any masters we want back onto the server.
>
> Unfortunately most of this software is in-house and quite specific to
> our setup; it's not very "generic" (e.g.
it assumes particular disk
> layouts and sizes, machines, database tables, hostnames, etc. to manage
> and track it all), so it's not something we're going to release.
>
> Rob

----
Cyrus Home Page: http://cyrusimap.web.cmu.edu/
Cyrus Wiki/FAQ: http://cyrusimap.web.cmu.edu/twiki
List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html
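[Editor's note: the slot/store failover arithmetic Rob describes can be sketched numerically. This is a minimal illustration using the numbers from his example (10 machines, 20 slots each, half masters); the machine names and the round-robin replica placement are hypothetical, not FastMail's actual layout.]

```python
from collections import Counter

# Sketch of the "slots"/"stores" failover arithmetic described in the
# mail: 10 machines, 20 slots each, half master and half replica.
# Machine names and replica placement are hypothetical.

MACHINES = [f"imap{i}" for i in range(10)]
SLOTS_PER_MACHINE = 20
MASTERS_PER_MACHINE = SLOTS_PER_MACHINE // 2   # 10 master slots each

def redistribute(failed, machines=MACHINES):
    """When `failed` dies, each of its master slots is taken over by the
    surviving machine holding the paired replica slot. With replicas
    spread evenly, each survivor picks up roughly one extra master."""
    survivors = [m for m in machines if m != failed]
    extra = Counter()
    for slot in range(MASTERS_PER_MACHINE):
        # assume replica partners are spread round-robin across survivors
        extra[survivors[slot % len(survivors)]] += 1
    return extra
```

With 10 master slots spread over 9 survivors, no machine takes more than two extra masters, whereas the original paired-machine scheme dumped all 10 onto a single partner.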