Re: Need advice on building a Cyrus IMAP cluster

Dave McMurtrie <dave64@xxxxxxxxxxxxxx> · Mon, 03 Aug 2009 20:21:19 -0400

Michael Sims wrote:

...snipped...

> So, here are my questions for anyone who can help me:
> 
> (1) Is the goal of implementing an active/active Cyrus cluster using shared
> storage and a shared file system a realistic one?

Yes.  It has been done successfully.

> (2) If so, what recommendations do people have for the file system?  GFS?
> OCFS2? something else?

When I worked at the University of Pittsburgh, we set up a 4-node, 
active/active Cyrus IMAP cluster.  It ran on Sun v440 servers running 
Solaris 8 using Veritas Cluster Filesystem.  If you need additional 
details about that setup, pop me an e-mail.  It's been over 4 years 
since I worked there, so I may be sketchy on details at this point. 
Pitt has since replaced their active/active Veritas cluster with an 
active/passive Sun cluster.

I believe the Computer Science department here at Carnegie Mellon is in 
the midst of setting up an active/active Cyrus IMAP cluster.  Since I 
don't work in that department, I don't have much in the way of details. 
I'm not sure whether Ray follows info-cyrus, but if he does, he can chime in.

Though I have no experience with it, I seem to recall that someone 
attempted to use GFS with an active/active Cyrus cluster and it was a 
disaster.  It was mentioned either on info-cyrus or in the Cyrus wiki. 
If Google doesn't help you find it, I can try to remember where I read it.

> (3) I've seen the "replicated" option for "mupdate_config" mentioned
> multiple times on the list, and reading the documentation gives me the
> impression that it applies to what I want to do, but I'm not 100% sure on
> that.  Can anyone confirm or deny this?

As of Cyrus 2.3, the code supports the notion of application-level 
replication.  It's near real-time replication of all the application 
data, but the replica copy of the data isn't live.  This is more of an 
active/passive solution, since you have to do something to make Cyrus 
aware of the second copy of the data if you suffer some type of failure 
of the first copy.
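
For reference, here is a rough sketch of what rolling replication looks 
like in a 2.3 configuration.  I'm writing the option names from memory 
of the 2.3 replication install notes, and the host name, user name, 
password and command paths are placeholders, so check imapd.conf(5) and 
cyrus.conf(5) for your version before relying on any of it:

```
# imapd.conf on the master (all values are placeholders)
sync_host: replica.example.edu
sync_authname: repluser
sync_password: changeme
sync_log: 1

# cyrus.conf on the master, inside START { ... }
syncclient    cmd="sync_client -r"

# cyrus.conf on the replica, inside SERVICES { ... }
syncserver    cmd="sync_server" listen="csync"
```

The listen="csync" line assumes a csync entry exists in /etc/services on 
the replica.  Nothing here handles failover; promoting the replica is 
still a separate, manual step.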

> (4) Assuming that pursuing the active/active approach is a bad idea, does
> anyone have alternate suggestions for the most efficient way to create a
> cluster that can provide BOTH high availability and load balancing?  I've
> seen references to some setups where there are two nodes, with each being a
> master node for half of the mailboxes and a slave node for the other half,
> and able to take over service for all the mailboxes in the case of failure
> of the other node.  But I can't seem to locate where I saw this setup
> described.  If anyone has any pointers to that, or alternate suggestions,
> I'd appreciate it.

We're doing pretty much what you describe.  Each of our Cyrus mail 
backend servers acts as a replica for one of the other backend servers, 
so we always have 2 complete copies of our data.  Unfortunately, in our 
case the failover would have to be accomplished completely manually and 
wouldn't be fast.  It would, however, be much faster than restoring from 
backup tape in a disaster.  The University of Michigan is using 
replication and rsync such that they have 3 copies of their data spread 
across separate data centers.  I'm told they can also fail over quite 
easily when necessary.  If you're interested in doing something like 
this, you may get a few pointers from umich.
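
To make the "each backend is a replica for one of the others" layout 
concrete, here is a minimal Python sketch.  It is only an illustration: 
the host names are made up, and pairing each backend with the next one 
in the list is an assumption of mine, not a description of how our 
servers are actually paired.

```python
def assign_replicas(backends):
    """Map each backend to the backend that holds its replica.

    Each backend replicates to the next one in the list, wrapping
    around at the end, so every backend is a master for its own
    mailboxes and a replica for exactly one other backend.
    """
    if len(backends) < 2:
        raise ValueError("need at least two backends")
    if len(set(backends)) != len(backends):
        raise ValueError("backend names must be unique")
    count = len(backends)
    return {name: backends[(i + 1) % count] for i, name in enumerate(backends)}


def data_held_by(replica_of):
    """For each host, list whose data it stores: its own, then replicated."""
    held = {host: [host] for host in replica_of}
    for master, replica in replica_of.items():
        held[replica].append(master)
    return held


# Hypothetical host names, for illustration only.
backends = ["mail1", "mail2", "mail3"]
replica_of = assign_replicas(backends)
held = data_held_by(replica_of)

for host in backends:
    print(f"{host}: replica on {replica_of[host]}, stores data for {held[host]}")
```

With only two backends this reduces to the setup you describe, where 
each node is the master for half of the mailboxes and the replica for 
the other half.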

Thanks,

Dave
-- 
Dave McMurtrie, SPE
Email Systems Team Leader
Carnegie Mellon University,
Computing Services
----
Cyrus Home Page: http://cyrusimap.web.cmu.edu/
Cyrus Wiki/FAQ: http://cyrusimap.web.cmu.edu/twiki
List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html