On 05/11/2011 08:35 AM, Nyamul Hassan wrote:
> 1. Can we mount a GlusterFS on a client and expect it to provide
> sustained throughput near wirespeed? Is there any subjective
> comparison between reading from GlusterFS and reading from local
> drives? Does it put extra pressure on the client?

As others have said, it depends on what you're doing. I've measured
over 900MB/s per server across three servers from one client using
10GbE, but only for very well-behaved workloads - large sequential I/O
with many concurrent threads. Use fewer threads or smaller requests,
throw in more random or synchronous operations, do more with metadata
than data, and your performance will drop from very good to quite
poor.

> 2. Reliability + Scalability means Distributed Replicated volumes.
> Initially this might be enough for our needs, but as our read
> requirements grow, the Striped option looks promising. Is it
> possible to mix Distributed + Replicated + Striped?

It is possible in the code, but there's no configuration support for
it. In other words, you can't do it with "gluster volume create", and
any "gluster volume set" is likely to undo any manual changes you've
made to the volume configuration files.

I've generally found the stripe translator to be of very limited use
anyway. In every test I've done, the overhead from splitting and
reassembling requests, and even from just having another translator in
the stack, has overwhelmed any advantage from splitting an individual
I/O across connections. The only compelling reason I've heard for
using the stripe translator has nothing to do with performance.
Without striping, the maximum size of a file is limited to the maximum
available space on any one brick. You can use striping to distribute
the space used by that file across multiple bricks and thus get beyond
that limit.

> 3. What happens to very large files. Say 100 GB files. Are they
> kept as a single file in every node that has the file? Or is it
> split up and distributed in blocks?

The "distribute" translator (a.k.a. DHT) will place the entire
contents of a file onto one of its component subvolumes - a single
brick, or a replica set if you're doing replication as well. Without
striping, that's the end of the story. With striping, DHT will place
the entire file onto one stripe set, which will then store the
contents on N bricks or replica sets below that. See above for an
explanation of why this might be useful in some very limited cases.
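
To make these answers more concrete, here are a few rough sketches.
First, the kind of well-behaved workload I mean for the first question
- many threads doing large sequential I/O. The mount point and sizes
here are made up, and a real benchmark tool will give you better
numbers, but something like this gets you into the right ballpark:

    # assumes a GlusterFS volume mounted at /mnt/gluster (hypothetical)
    # eight concurrent writers, each streaming 4GB in 1MB requests
    for i in $(seq 1 8); do
        dd if=/dev/zero of=/mnt/gluster/seqtest.$i bs=1M count=4096 &
    done
    wait
    # read them back the same way to exercise read throughput
    for i in $(seq 1 8); do
        dd if=/mnt/gluster/seqtest.$i of=/dev/null bs=1M &
    done
    wait

Drop to one or two threads, shrink bs to 4k, or make the access
pattern random, and you'll see the drop-off I described.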
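
For point 2, what "gluster volume create" does support is distribute
layered on top of replicate. Server and brick names here are
hypothetical; with four bricks and "replica 2" you get two replica
pairs, and DHT distributes files across those pairs:

    gluster volume create myvol replica 2 \
        server1:/export/brick1 server2:/export/brick1 \
        server3:/export/brick1 server4:/export/brick1

Bricks are grouped into replica sets in the order you list them, so
server1/server2 form one pair and server3/server4 the other. It's
adding "stripe N" into that same mix that lacks the configuration
support I mentioned.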
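
And for point 3, you can see DHT's whole-file placement directly on
the servers. With a volume like the one above and no striping, a 100GB
file written on the client lands intact on exactly one replica pair
(paths hypothetical again):

    # on the client
    dd if=/dev/zero of=/mnt/gluster/bigfile bs=1M count=102400

    # on the servers: the whole file is on one replica pair's bricks,
    # and simply absent from the others
    ls -lh /export/brick1/bigfile

With striping you'd instead find a piece of it on every brick in one
stripe set, each holding an interleaved subset of the blocks.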