On 17 March 2011 00:39, Mohit Anchlia <mohitanchlia at gmail.com> wrote:
> I've had several discussions with different set of people about using
> rsync and everyone thinks it's ok to use rsync (2 way) for WAN
> replication in active/active data centers as long as it's done using
> file system mounted on the client. I am sending this out to this user
> list in case anyone sees any problems or better solution. Please
> advice.

By 2-way rsync, do you mean periodically running rsync in both datacenters? If so, there are quite a few issues to do with the exact arguments you give rsync; get them wrong and you could end up losing data.

Questions to ask:

1) How are you going to trigger an rsync run?
2) If it's inotify (or similar) based, how are you going to stop the changes written by the sync itself from triggering an update back from the other site?
3) If it's cron, how do you prevent partially transferred files from clobbering the other site? e.g. site A starts to sync to site B and starts to transfer a file FOO, site B then starts to sync to site A and notices that FOO on site B differs from FOO on site A, so transfers it to site A...
4) How do you deal with deletions? If a file isn't present on one site, is that because it's been deleted, or because it hasn't been created there yet?
5) How long will it take to scan the filesystem to build the list of files to sync? If you have lots of small files, this could be a non-trivial amount of time.

I imagine there are more, but these are the first ones I thought of when I was thinking about how to do this. Of course, whether you have to worry about some of these points depends on the shape of your data. Points 1-3 were the ones that worried me. You could create a locking mechanism (first check whether rsync is running on the remote node, and don't run if it is), but it starts to look increasingly complicated.
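To illustrate question 4, here is a minimal Python sketch. It assumes each site can persist the set of paths that existed on both sides after the last successful sync; rsync itself keeps no such state between runs. The function name classify and the prev_common manifest are hypothetical, and the sketch ignores content conflicts and files that are deleted on one site while being recreated on the other.

```python
def classify(prev_common, site_a, site_b):
    """Decide what to do with paths that exist on only one site.

    prev_common: set of paths present on BOTH sites after the last sync
                 (hypothetical manifest; rsync does not record this).
    site_a, site_b: sets of paths currently present on each site.
    """
    actions = {}
    for path in sorted(prev_common | site_a | site_b):
        in_a = path in site_a
        in_b = path in site_b
        was_synced = path in prev_common
        if in_a and in_b:
            continue  # present on both; comparing content is out of scope here
        if not in_a and not in_b:
            actions[path] = "deleted on both"
        elif was_synced:
            # It existed on both sides before, so its absence is a deletion
            # that should be propagated to the site still holding the file.
            actions[path] = "delete on B" if in_b else "delete on A"
        else:
            # Never seen on both sides, so it is a new file to copy across.
            actions[path] = "copy to B" if in_a else "copy to A"
    return actions


if __name__ == "__main__":
    previous = {"foo", "bar"}
    a_now = {"foo", "new_a"}   # "bar" removed on A, "new_a" created on A
    b_now = {"foo", "bar"}
    print(classify(previous, a_now, b_now))
```

Without prev_common, "bar" and "new_a" look identical (each is present on one site only), which is exactly the ambiguity described in question 4.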
In the end I decided to use GlusterFS with replication and bricks in both sites, because performance wasn't as important to me as not having to hack up a sync protocol without application/FS support. Also, my WAN link is very reliable and reasonably low latency.

Regards
--
Jonathan Barber <jonathan.barber at gmail.com>