On Fri, Oct 22, 2010 at 10:55 AM, Horacio Sanson <hsanson at gmail.com> wrote:
> Distributed volume: Aggregates the storage of several directories (bricks in
> gluster terms) among several computers. The benefit is that you can
> grow/shrink the volume as you please. The bad part is that this offers no
> performance/reliability guarantees as files are stored randomly among the
> disks in the volume.
>
> Replicated volume: Requires minimum 2 bricks in separate servers. All files are
> replicated among the bricks. How many replicas can be configured at volume
> creation. Has all the benefits of a Distributed volume plus fail resilience.
>
> Stripe volume: Requires minimum 2 bricks in separate servers. All files are
> splitted in stripes and these stripes are distributed among the bricks of the
> volume. How many stripes and which size is configured on volume creation. Has
> all the benefits of Replicated volume plus reliability and can improve read
> performance for large files as the read is distributed among several machines.

2 comments:

1) Stripe by itself offers no redundancy. You mention that it has "all the benefits of replication" - it actually doesn't. If you use only stripe and lose a brick, your data is corrupt (say you have 4 nodes and 1 is lost, you only have 3/4 of every file stored, which is pretty useless to you). Consider this something akin to RAID0.

2) You can, however, mix and match these translators to your convenience. I'm designing a site at the moment where pairs of nodes are set up in replicate, and then overall all data is striped over each replicate pair. This is somewhat like the concept of RAID10.

To answer the original poster's question of "how does the data spread itself?", well that's up to you. My design is to have replicate pairs, and stripe across many of these. You could instead do the reverse, and have striped pairs which all data would replicate over.
If you think about it, the latter ends up with less usable storage and no real speed gain. The former ensures that as new storage bricks are added, data is striped across more pairs, and the overall speed benefit is greater.

One thing to consider also is that striping means your data is broken into chunks and spread around the cluster. Should something go awry (either physically or logically), then your data could potentially be lost. The "distribute" translator is slightly safer in this regard. If worst comes to worst and you suffer either a logical or physical error destroying part of your data, it's a simple task to just manually mount up the underlying file system and recover at least some of your data (as with "distribute" each brick stores only whole files).

With that in mind, the "stripe" translator is best suited to sites where very large files are accessed frequently by many clients. I'm planning it for a site where a few 1TB files need to be read in by 30 clients quasi-simultaneously. Starting each client off at slightly different times (even a few seconds apart) means they should theoretically be reading different chunks from different bricks, and the overall bandwidth of the cluster will not bottleneck at any one point. Compare this to a single NFS server with all 30 clients smashing it for the same file, and GlusterFS with stripe is clearly a better option.

If your site has many clients accessing relatively small files (even up to a few hundred MB each) in an ad-hoc fashion, then "distribute" is a much safer bet. You'd most likely end up with as good performance as "stripe" site-wide, and have the added benefit of being able to manually recover files from a brick should something go wrong. "Distribute" is certainly my pick for your average business that has lots of unstructured data in the form of documents, images and the like. Ditto for large file stores for things like web farms and whatnot.
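
To make the stripe vs. distribute trade-off a bit more concrete, here is a rough Python sketch. It is a toy model, not GlusterFS code: it assumes stripe places chunks round-robin across bricks, and it uses a trivial byte-sum hash as a stand-in for whatever hashing "distribute" really does. The function names, chunk size and file names are all made up for illustration.

```python
from collections import Counter

# Toy model only. Round-robin chunk placement and the byte-sum hash are
# assumptions for illustration, not GlusterFS's actual algorithms.

def stripe_brick(offset, chunk_size, num_bricks):
    """Brick index holding the chunk that contains byte `offset`."""
    return (offset // chunk_size) % num_bricks

def distribute_brick(filename, num_bricks):
    """Brick index holding the whole file (stand-in hash)."""
    return sum(filename.encode()) % num_bricks

def damaged_files(files, num_bricks, lost_brick, chunk_size, mode):
    """Names of files that lose at least one byte when `lost_brick` dies.

    `files` maps file name -> size in bytes; `mode` is "stripe" or
    "distribute".
    """
    damaged = []
    for name, size in files.items():
        if mode == "stripe":
            num_chunks = -(-size // chunk_size)  # ceiling division
            bricks_used = {i % num_bricks for i in range(num_chunks)}
        else:
            bricks_used = {distribute_brick(name, num_bricks)}
        if lost_brick in bricks_used:
            damaged.append(name)
    return damaged

def brick_load(client_offsets, chunk_size, num_bricks):
    """How many clients are reading from each brick at this instant."""
    hits = Counter(stripe_brick(off, chunk_size, num_bricks)
                   for off in client_offsets)
    return dict(hits)

if __name__ == "__main__":
    files = {"a.img": 1000, "b.img": 1000, "c.img": 1000, "d.img": 1000}
    print(damaged_files(files, 4, 1, 100, "stripe"))
    print(damaged_files(files, 4, 1, 100, "distribute"))
    print(brick_load([0, 0, 0, 0], 100, 4))        # clients in lockstep
    print(brick_load([0, 100, 200, 300], 100, 4))  # clients staggered
```

In this model, with four 1000-byte files, 100-byte chunks and four bricks, losing one brick damages every striped file but only the one distributed file that lived on that brick. Likewise, four clients reading a striped file in lockstep all hit the same brick, while clients staggered by one chunk each land on separate bricks.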

As above, I'd only consider stripe where VERY large files are accessed by many clients at the same time, and speed is of the essence.

-Dan