On 06/29/2010 10:04 AM, Emmanuel Noobadmin wrote:
> So most likely I would run two or more physical machines with VM to
> failover to each other to catch situations of a single machine
> failure. Along with that a pair of storage server. In the case of a
> total failure where both the primary & secondary VM dies physically,
> roll in a new machine to load up the VM images still safe on the
> gluster data servers.
>
> So in this case would I be correct that my configuration, assuming a
> basic 2 physical VM host server and 2 storage server would probably
> look something like
>
> volume rep0
> type cluster/replicate
> option read-subvolume vmsrv0vol0
> subvolumes vmsrv0vol0 datasrv0vol0 datasrv1vol0
> end-volume
>
>
> volume rep1
> type cluster/replicate
> option read-subvolume vmsrv1vol0
> subvolumes vmsrv1vol0 datasrv0vol0 datasrv1vol0
> end-volume
>
> volume my_nufa
> type cluster/nufa
> option local-volume-name rep0
> subvolumes rep0 rep1
> end-volume
>
> Or did I lose my way somewhere? :)

That looks reasonable to me, except that the last stanza would only apply
on vmsrv0.  For vmsrv1, you'd want this instead:

    volume my_nufa
        type cluster/nufa
        option local-volume-name rep1   # this is the only difference
        subvolumes rep0 rep1
    end-volume

It's a little unfortunate that you can't do this with a single volfile,
perhaps with $-variable substitutions or some such, but that's the way
it is AFAIK.

> Does it make any sense to replicate across all 3 or should I simply
> spec the VM servers with tiny drives and put everything on the gluster
> storage which I suppose would impact performance severely?

That's a pretty murky area.  With a fast interconnect it's tempting to
say the "storage of record" should be only on the data nodes and the app
nodes should only do caching.  It would certainly be simpler, though
with more than two data nodes you'd have to do essentially the same
layering of distribute on top of replicate (nufa wouldn't be
particularly useful in that configuration).

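For illustration, here is a rough sketch of what the data-nodes-only
layout might look like on each app node with just the two data nodes.
It assumes datasrv0vol0 and datasrv1vol0 are already defined by client
stanzas not shown here, the name rep_data is made up for the example,
and any caching translators are left out.  Treat it as an untested
outline, not a known-good config.

```
# Sketch only: storage of record lives on the two data nodes.
# "rep_data" is a hypothetical name; datasrv0vol0 and datasrv1vol0 are
# assumed to be defined earlier in the same volfile.
volume rep_data
    type cluster/replicate
    subvolumes datasrv0vol0 datasrv1vol0
end-volume
```

Since no subvolume is local in this layout, the same volfile could be
used unchanged on both app nodes.
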
If you wanted to stick with something more like the above, you'd just need to pair each app node with a data node, so e.g. rep0=vmsrv0vol0+datasrv0vol0 and rep1=vmsrv1vol0+datasrv1vol0. You would probably also want to "cross" the read-subvolume assignments, so for example reads on vmsrv0 would go first to datasrv1vol0 instead of vmsrv1vol0 for rep1. This avoids having the app nodes talk to each other when they could be talking to data nodes instead.
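
As a sketch of that paired layout as seen from vmsrv0, assuming the
second pair is vmsrv1vol0+datasrv1vol0 and that all four subvolumes are
defined by stanzas not shown here, the volfile might look something like
the following.  This is my reading of the "crossed" read-subvolume idea
and is untested.

```
# Sketch for vmsrv0 only; untested.
volume rep0
    type cluster/replicate
    option read-subvolume vmsrv0vol0     # local disk first
    subvolumes vmsrv0vol0 datasrv0vol0
end-volume

volume rep1
    type cluster/replicate
    option read-subvolume datasrv1vol0   # "crossed": data node, not the other app node
    subvolumes vmsrv1vol0 datasrv1vol0
end-volume

volume my_nufa
    type cluster/nufa
    option local-volume-name rep0
    subvolumes rep0 rep1
end-volume
```

On vmsrv1 the mirror image would presumably apply: rep0 would read from
datasrv0vol0, rep1 from vmsrv1vol0, and local-volume-name would become
rep1.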