Hi all,

I've spent the day reading the docs, blog posts, this mailing list, and lurking on IRC, but I still have a few questions to ask.

My goal is to implement a cross-availability-zone file system in Amazon EC2, and ensure that even if one server goes down or is rebooted, all clients can continue reading from/writing to a secondary server. The primary purpose is to share some data files for running a web site for an open source project - a Mercurial repository and some shared data, such as wiki images - but the main code/images/CSS etc. for the site will be stored on each instance and managed by version control.

As we have 150GB of ephemeral storage (aka instance store, as opposed to EBS) free on each instance, I thought we could use that as the POSIX backend for Gluster and keep a complete copy of the Mercurial repository on each system, with each client using its local brick as the read subvolume for speed (a rough sketch of the volfiles I have in mind is in the P.S. below). That way, you don't need to go to the network for reads, which ought to be far more common than writes. We want the files available to seven servers, four in one AZ and three in another.

I think it best if we maximise client performance rather than replication speed; if one of our nodes is a few seconds behind, it's not the end of the world, but if every file write consistently takes a few seconds, that would be irritating.

Some questions which I hope someone can answer:

1. Somewhat obviously, when we turn on replication and introduce a second server, write speed to the volume drops drastically. With client-side replication we get redundancy across servers, but does this mean the GlusterFS client blocks, waiting for the write to complete on every server? If we changed to server-side replication, would the replication overhead happen in the background?

2. If we were to use server-side replication, should we use the write-behind translator in the server stack?

3. I was originally using 3.0.2 as packaged with Ubuntu 10.04, and have tried upgrading to 3.0.5rc7 (as suggested on this list) for better performance with the quick-read translator, and other fixes. However, this actually seemed to make write performance *worse*! Should this be expected? (Our write test is totally scientific *cough*: we cp -a a directory of files onto the mounted volume.)

4. Should I expect a different performance pattern using the instance storage rather than an EBS volume? I found this post helpful - http://www.sirgroane.net/2010/03/tuning-glusterfs-for-apache-on-ec2/ - but it talks more about reading files than writing them, and it writes off some translators as not useful because of the way EBS works.

5. Is cluster/replicate even the right answer? Could we do something with cluster/distribute (perhaps layered over replicate) - in effect, a RAID 10? It doesn't seem that replicate alone could scale up to the number of nodes you hear about other people using GlusterFS with.

6. Could we do something crafty where you read directly from the POSIX backend but do all writes through GlusterFS? I see it's unsupported, but I guess that's just because you might get stale data by reading the disk directly rather than through the client.

Any advice anyone can provide is welcome, and my thanks in advance!

Regards

Craig
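
P.S. In case it helps to make the setup concrete, here's a rough sketch of the volfiles I have in mind for the client-side replication approach. Hostnames and paths are placeholders, only two of the seven bricks are shown, and I'm going from memory on the option names, so I may well have details wrong:

  # server volfile (one per node), exporting the ephemeral disk
  volume posix
    type storage/posix
    option directory /mnt/ephemeral/glusterfs   # placeholder instance-store mount point
  end-volume

  volume locks
    type features/locks
    subvolumes posix
  end-volume

  volume brick
    type performance/io-threads
    subvolumes locks
  end-volume

  volume server
    type protocol/server
    option transport-type tcp
    option auth.addr.brick.allow *              # would be restricted to our own hosts in practice
    subvolumes brick
  end-volume

  # client volfile (two of the seven bricks shown)
  volume node1
    type protocol/client
    option transport-type tcp
    option remote-host node1.example.com        # placeholder hostname
    option remote-subvolume brick
  end-volume

  volume node2
    type protocol/client
    option transport-type tcp
    option remote-host node2.example.com        # placeholder hostname
    option remote-subvolume brick
  end-volume

  volume replicate
    type cluster/replicate
    option read-subvolume node1                 # set to whichever brick is local on each client
    subvolumes node1 node2
  end-volume

  volume writebehind
    type performance/write-behind
    option cache-size 1MB                       # guessed value; not sure this is the right knob
    subvolumes replicate
  end-volume

The idea is that read-subvolume points at the local brick on each client so reads never leave the box, while writes go through replicate - which is exactly where my question 1 comes in.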