Quoting Shuvam Misra <shuvam.misra@xxxxxxxxxxxxxx>:
Quoting Bron Gondwana <brong@xxxxxxxxxxx>:

> It's getting better, but it's still not 100% reliable to have
> master/master replication between two servers with interactions
> going to both sides.
>
> It SHOULD be safe now to have a single master/master setup with
> individual users on one side or the other - but note that nobody
> is known to be running that setup successfully yet.
>
> As for what the point is? I don't know about you, but I run a
> 24hr/day shop, and I like to be able to take a server down for
> maintenance in about 2 minutes, with users seeing a brief
> disconnection and then being able to keep using the service
> with minimal disruption.
>
> Bron.

As Bron already mentioned, you can easily live without master/master mode and its problems. We run multiple servers in pairs: each server runs one Cyrus instance as master and one as replica, so the two machines of a pair replicate each other. If one server crashes, the surviving one runs two master instances. You only need a way of splitting the users between the servers; that could be DNS, a proxy, or a murder setup.

Are you using local storage on each server for spool and metadata?
We have all Cyrus storage on iSCSI systems.
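A rough sketch of how such a pair might be wired up with Cyrus 2.3-style replication (hostnames, credentials, and instance names here are placeholders, not our actual config; with two instances per box you would also point each at its own configdirectory):

    # imapd.conf for the master instance on imap1 (placeholder names):
    sync_log: 1
    sync_host: imap2.example.com
    sync_authname: replication
    sync_password: secret

    # cyrus.conf for the replica instance on imap2: listen for sync traffic
    SERVICES {
      ...
      syncserver cmd="sync_server" listen="csync"
    }

Rolling replication is then driven by running sync_client -r on the master side.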
How good/bad is the idea of using shared storage (an external SAN chassis) and letting multiple servers keep their spool areas there? Can one set up, say, half a dozen servers in a pool, each using a separate LUN for spool+data on a common back-end SAN chassis? Out of the six servers, one would be a hot spare, standing by. If any of the five active servers failed, the standby would be told to mount the failed server's LUN, borrow the failed server's IP address, and start offering services?
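The takeover step could be scripted; a minimal sketch in Python, assuming open-iscsi and iputils are installed, where the IQN, portal, device, mount point, service IP, interface, and init script are all made-up placeholders (and a real setup would also need fencing so the failed node cannot come back and mount the same LUN concurrently):

    #!/usr/bin/env python
    # Hypothetical standby-node takeover sketch; all names are placeholders.
    import subprocess

    def run(*cmd):
        print("+", " ".join(cmd))
        subprocess.check_call(cmd)

    def take_over(target_iqn, portal, device, mountpoint, service_ip, iface):
        # 1. Log in to the failed server's iSCSI LUN.
        run("iscsiadm", "-m", "node", "-T", target_iqn, "-p", portal, "--login")
        # 2. Check the filesystem before mounting (this is the step whose
        #    duration worries me).
        run("fsck", "-y", device)
        run("mount", device, mountpoint)
        # 3. Borrow the failed server's IP and announce it via gratuitous ARP.
        run("ip", "addr", "add", service_ip + "/24", "dev", iface)
        run("arping", "-U", "-c", "3", "-I", iface, service_ip)
        # 4. Start Cyrus on the spool we just mounted.
        run("/etc/init.d/cyrus-imapd", "start")

    if __name__ == "__main__":
        take_over("iqn.2009-01.com.example:imap3", "192.0.2.10:3260",
                  "/dev/sdb1", "/var/spool/imap", "192.0.2.53", "eth0")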
That would work, but you would still have a single point of failure if the SAN system crashes or if the filesystem of one backend gets corrupted. We have six servers and two independent iSCSI systems. Each iSCSI system holds three partitions for active servers and three partitions for replicas.
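For what it's worth, my reading of that layout (server names invented, and presumably crossed so that each server's replica lives on the other iSCSI system than its active partition):

    iSCSI system A                     iSCSI system B
    --------------                     --------------
    active partition  imap1            active partition  imap4
    active partition  imap2            active partition  imap5
    active partition  imap3            active partition  imap6
    replica partition imap4            replica partition imap1
    replica partition imap5            replica partition imap2
    replica partition imap6            replica partition imap3

Crossed that way, losing either iSCSI system still leaves every mailbox reachable, as active or as replica, on the other.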
In this proposed model, each user's account lives on one "physical" server (i.e. bound to a specific IP address), so no load balancing or connection spreading is needed when clients connect. If the site chooses to use Murder, the proposed model can apply to the back-end while the multiplexer takes care of the front-end. The only thing I'm not sure about is filesystem corruption when a node goes down, and the time taken for an fsck before the standby node can assume the failed node's role. I wonder whether something like ext4 will reduce fsck times to acceptable levels.
The fsck time is one thing, but if you lose data in one partition you have a problem: restoring from a file-based backup is a pain when you have as many small files as Cyrus does.
Is this a good idea for a scalable fault-tolerant Cyrus setup? I've been toying with this approach for some time, for a proposed large-system design.
We are testing Cyrus murder to ease the work of switching over to a replica and back.

--------------------------------------------------------------------------------
M.Menge                         Tel.: (49) 7071/29-70316
Universität Tübingen            Fax.: (49) 7071/29-5912
Zentrum für Datenverarbeitung   mail: michael.menge@xxxxxxxxxxxxxxxxxxxx
Wächterstraße 76, 72074 Tübingen