On 06/29/2010 11:31 PM, Emmanuel Noobadmin wrote:
> With the nufa volumes, a file is only written to one of the volumes
> listed in its definition.
> If the volume is a replicate volume, then the file is replicated on
> each of the volumes listed in its definition.
>
> e.g. in this case
> volume my_nufa
>   type cluster/nufa
>   option local-volume-name rep1
>   subvolumes rep0 rep1 rep2
> end-volume
>
> A file is only found in one of rep0, rep1, or rep2. If it was on rep2,
> then it would be inaccessible if rep2 fails, such as a network failure
> cutting rep2 off.

Yes, but rep2 as a whole could only fail if all of its component volumes - one on an app node and one on a data node - failed simultaneously. That's about as good protection as you're going to get without increasing your replication level (and therefore decreasing both performance and effective storage utilization).

> Then when I add a rep3, gluster should automatically start putting new
> files onto it.
>
> At this point though, it seems that if I use nufa, I would have an
> issue if I add a purely storage-only rep3 instead of an app+storage
> node. None of the servers will use it until their local volume reaches
> max capacity, right? :D
>
> So if I preferred to have the load spread out more evenly, I should
> then be using cluster/distribute?

If you want even distribution across different or variable numbers of app/data nodes, then cluster/distribute would be the way to go. For example, you could create a distribute set across the storage nodes and a nufa set across the app nodes, and then replicate between the two (each app node preferring the local member of the nufa set). You'd lose the ability to suppress app-node-to-app-node communication with different read-subvolume assignments, though, and in my experience replicate over distribute doesn't work quite as well as the other way around.

Another option, since you do have a fast interconnect, would be to place all of the permanent storage on the data nodes and use the storage on the app nodes only for caching (as we had discussed). Replicate pair-wise or diagonally between the data nodes, distribute across the replica sets, and you'd have a pretty good solution to handle future expansion.
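
To make that last option a bit more concrete, here's a rough client-side volfile sketch of "replicate pair-wise between data nodes, distribute across the replica sets". The host names and brick names are just placeholders and I haven't tested this, so treat it as an illustration rather than a working config:

# Bricks exported by four data nodes (names are placeholders).
volume data1
  type protocol/client
  option transport-type tcp
  option remote-host data-node1
  option remote-subvolume brick
end-volume

volume data2
  type protocol/client
  option transport-type tcp
  option remote-host data-node2
  option remote-subvolume brick
end-volume

volume data3
  type protocol/client
  option transport-type tcp
  option remote-host data-node3
  option remote-subvolume brick
end-volume

volume data4
  type protocol/client
  option transport-type tcp
  option remote-host data-node4
  option remote-subvolume brick
end-volume

# Pair-wise replication between data nodes.
volume rep0
  type cluster/replicate
  subvolumes data1 data2
end-volume

volume rep1
  type cluster/replicate
  subvolumes data3 data4
end-volume

# Distribute across the replica sets.
volume dist
  type cluster/distribute
  subvolumes rep0 rep1
end-volume

Expansion then just means bringing up another pair of data nodes, defining another replicate volume over them, and appending it to the distribute's subvolumes list.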