On Thu, Oct 22, 2009 at 12:56:03AM -0700, Jon . wrote:
> On Wed, Oct 21, 2009 at 9:20 PM, Rob Mueller <robm@xxxxxxxxxxx> wrote:
> ...
> > The difference between "in theory this would work" and the practice of
> > actually doing it are huge. Basically it works only if you are 100% sure
> > that only one side is ever being accessed at a time, e.g. IMAP/POP/LMTP/etc.

Pretty much. With appropriate fencing, non-local bind and a service IP
address, that's feasible. But Rob won't let me do it. Fair enough too,
it's pretty messy.

> ...
> > In other words, DON'T DO THIS.
> >
> > Rob

Yeah, yeah. I know. I could have worded it a bit more strongly:
"Nobody's ever done it, because it's really tricky to get right and
you'll lose data for sure if you don't know what you are doing."

> What are the particular bits that could conflict and have undesirable
> results? Metadata, messages, entire mailboxes? In this hypothetical
> active/active configuration, what exactly could an IMAP client
> potentially do to create undesirable results?

Yes. Those things. Any and/or all. Try thinking about a folder rename
at one end and a copy/expunge cycle between folders at the other end,
and then resolving the resultant mess.

Basically this is tricky stuff that nobody does particularly well.
Generic sync is a hard problem[tm], and the Cyrus code doesn't even
try. In particular, it doesn't track deltas. To get even halfway good
tracking of changes, you need three things:

1) the current state of A
2) the current state of B
3) the state the last time A and B were in sync

Even better is knowing the actual changes that were made and resolving
them. But without even that much information, consider the following:

  A: UID 5 is SEEN
  B: UID 5 is UNSEEN

What should the result be?

> Would it be a huge undertaking to timestamp data that is to be
> replicated to another Cyrus daemon, for the receiving Cyrus daemon to
> know which conflicting pieces of data to drop in favor of newer data?

Timestamp each piece of metadata individually? Yes - it would be a huge
undertaking.

> Right now I have a client who needs 130 or so users on a private mail
> server and has two cheap 1U Dell servers to work with. Ideally they
> are to be put in physically distanced data centers for redundancy to
> one another.
>
> Combined with the hypothetical replication of timestamped data
> described above, wouldn't setting the fqdn imap.example.com to resolve
> to two IP addresses, so users' IMAP clients can fall back should an
> IMAP storage server be unavailable (with at least the most recent data
> that replication of any kind is able to provide), make for a much
> simpler and more elegant solution than DRBD, clustered filesystems, or
> introducing more machines just for load balancing / resolving to an
> available IMAP daemon? Also, wouldn't timestamps hypothetically
> resolve the inevitable split-brain situations clients would create?

I assume they don't like losing messages. If you really, REALLY want to
go down that path, I would at least take FastMail's patch that checks
the GUID when the same message exists on both ends and refuses to
overwrite if the message contents differ.
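(Very roughly, the idea behind that check - not the actual patch, which
lives in the C sync code; the helper names below are invented for
illustration - is that the GUID is just a hash over the raw message, so
two copies only count as "the same message" when their bytes match:)

    import hashlib

    def message_guid(raw_message):
        # In 2.3-era Cyrus replication the message GUID is a SHA-1 over
        # the raw message text: identical content => identical GUID.
        return hashlib.sha1(raw_message).hexdigest()

    def ok_to_overwrite(local_raw, guid_from_other_end):
        # Refuse to let sync clobber a local message whose content
        # differs from the copy the other end holds under the same UID.
        return message_guid(local_raw) == guid_from_other_end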
This is half a solution; you then need to resolve the conflict. We
LOCALDELETE the original message at both ends so it doesn't even wind up
in the .expunge file, then we append BOTH messages with brand new UIDs
and set the flags they used to have on the master - finally syncing the
resulting mailbox again so both messages are on both ends. The code for
that isn't in our Cyrus patches, though - it's a standalone script.

And that's just for the split brain that results when a machine dies for
whatever reason (it happened last night, incidentally - one of our
external RAID units had an "episode" and decided to stop talking to the
server. It looks like a couple of timeouts on a failing drive tickled a
firmware bug and resulted in the inbuilt OS locking up. Software, you
have to love it. Embedded at all layers. So many firmwares to keep
up-to-date!) - and we don't have multi-directional replication.

------------------------------------------------------------------------

Our approach to utilisation handling has been documented here plenty of
times. Basically we run multiple instances of Cyrus on each machine, so
every server has both masters and replicas. We can shut down any one
machine just by switching roles (a shutdown and restart of each end with
new configs).

------------------------------------------------------------------------

Dear Bron... do you have some configuration examples somewhere for this
kind of structure?
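(In sketch form, the split-brain recovery sequence described above - not
FastMail's actual standalone script; LOCALDELETE is their Cyrus
extension, and the plain dicts below just stand in for two real servers -
looks something like this:)

    def resolve_uid_conflict(master, replica, mailbox, uid, next_uid):
        # Grab both copies while removing the conflicting UID at BOTH
        # ends (standing in for LOCALDELETE, so neither copy lands in
        # the .expunge file and neither end silently "wins").
        m_copy = master[mailbox].pop(uid)
        r_copy = replica[mailbox].pop(uid)
        # Append BOTH messages with brand new UIDs, carrying the flags
        # the message used to have on the master, then mirror to the
        # replica so both ends finish with both messages.
        for raw in (m_copy["raw"], r_copy["raw"]):
            master[mailbox][next_uid] = {"raw": raw,
                                         "flags": set(m_copy["flags"])}
            replica[mailbox][next_uid] = dict(master[mailbox][next_uid])
            next_uid += 1
        return next_uid

    # Toy example: UID 5 has diverged (different content on each end).
    master  = {"INBOX": {5: {"raw": b"copy as delivered",
                             "flags": {"\\Seen"}}}}
    replica = {"INBOX": {5: {"raw": b"copy as resent",
                             "flags": set()}}}
    resolve_uid_conflict(master, replica, "INBOX", uid=5, next_uid=6)
    assert master["INBOX"] == replica["INBOX"] and len(master["INBOX"]) == 2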