On 11/26/2012 03:44 AM, Zohair Raza wrote:
> I understand your point. At the time I also tried an NFS-like clustering
> setup, but it didn't help.
>
> There is master-master geo-replication planned for 3.4:
> http://www.gluster.org/community/documentation/index.php/Planning34
>
> I think it is meant for the same purpose; has anyone got more information
> on it?
>
> Regards,
> Zohair Raza
>
> On Mon, Nov 26, 2012 at 2:58 PM, Robert Hajime Lanning
> <lanning at lanning.cc> wrote:
>
>> On 11/25/12 23:26, Zohair Raza wrote:
>>
>>> Hi,
>>>
>>> Thanks for the reply.
>>>
>>> Can you please elaborate on the last line? I understand that reads
>>> will have no issues. I tried implementing a replicated volume, but the
>>> problem is that Gluster starts uploading the file to node2 while it is
>>> still being copied. For example, if a 500MB file is copied from a LAN
>>> machine to node1 at site1, it copies at the speed of my internet link,
>>> whereas I want it copied at the much faster LAN speed (in MBps).
>>>
>>> Isn't there any way to set the synchronization speed, or to have
>>> Gluster sync only after the file has been copied?
>>
>> All the smarts are in the client.
>>
>> If you have a replica count of 2, then when a client is writing, it is
>> writing to 2 bricks at the same time. There is no such thing as queuing
>> for later sync.
>>
>> What happens if a client at site A is writing to the same file as a
>> client at site B? If you have a delayed write to a remote site, how do
>> you solve write conflicts? You would need to completely understand the
>> file format and its transactional state, so that the 2 separate writes
>> can be merged without corrupting the file.
>>
>> If there is a conflict, there is no way to notify the process that was
>> writing, because the write would have already returned as successful,
>> since it was queued for later execution on the file.
>>
>> The only way to solve this is to have synchronized locks and
>> synchronized writes. It needs to behave like a local filesystem with 2
>> processes writing.
>>
>> Geo-replication solves this by saying one site is the master and all
>> writes happen there. The other site is a replica of the master, period.
>> This gives you a single source of truth about file state and no
>> conflicts to mediate.
>>
>> For a database with ACID transactions and atomic data structures, you
>> can design the data and data structures for multi-master replication.
>> You can state that the latest update of an atomic structure wins, then
>> design your application around that. For a filesystem, you can't, as
>> you do not have visibility into the structure of the files.
>>
>> The commercial NAS systems that have multi-master capabilities do it at
>> the block level (not file) and do it synchronously.
>>
>> I currently do not know of a way to implement a multi-master
>> asynchronous network filesystem without introducing the possibility of
>> file corruption.

I wrote up about the first third of why what you're asking to do is
difficult on my blog at
http://www.joejulian.name/blog/why-replicated-filesystems-are-hard/
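
For reference, a rough sketch of the replica-2 setup Robert describes above.
The volume name, hostnames, and brick paths here are placeholders, and exact
options may differ between GlusterFS versions:

    # Create and start a two-way replicated volume (placeholder names).
    gluster volume create demo-vol replica 2 transport tcp \
        node1:/export/brick1 node2:/export/brick1
    gluster volume start demo-vol

    # Mount with the native client. Every write the client makes goes to
    # both bricks synchronously, which is why a slow WAN link between the
    # bricks caps the write speed seen on the LAN.
    mount -t glusterfs node1:/demo-vol /mnt/demo-vol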
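
And a sketch of geo-replication as Robert describes it: one master volume,
one asynchronous slave. The slave host and path are placeholders and the
syntax below is roughly the 3.2/3.3 form; check the docs for your version:

    # Start asynchronous replication from the master volume to a slave
    # directory on a remote host; all writes still happen on the master.
    gluster volume geo-replication demo-vol remote-host:/data/remote_dir start

    # Check the state of the session (e.g. starting, OK, faulty).
    gluster volume geo-replication demo-vol remote-host:/data/remote_dir status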