Hi Keith,

Thanks for the detailed response.

On 01/26/09 15:31, Keith Freedman wrote:
> At 09:12 AM 1/25/2009, Prabhu Ramachandran wrote:
>> With glusterfs I used unify for the 4 partitions on each machine and
>> then afr'd the two unified disks but was told that this is not a
>> reliable way of doing things and that I'd run into problems when one
>> host goes down.
>
> it depends what happens when a host goes down.
> if the issue is "server crashed" then you should be fine doing this
> with gluster/HA translator.
>
> as long as you unify bricks that are all on one server, and AFR to a
> unify of bricks all on the other server, then if one server is down,
> AFR will only use the unify brick of the server that is up.
> when the other server comes back up, things will auto-heal the server
> that was down.

OK, I'm in luck then.  I have two machines, say A and B, each of which
has 4 different-sized partitions (on different disks) that I unify.  So
I have a unifyA and a unifyB.  I then AFR these two unified bricks.

> IF your problem is that a server is temporarily down because of an
> unstable network connection, then you have a more difficult problem.

Well, I only used the setup for a short while before Krishna told me
that I was likely to run into problems, so I stopped.  I didn't really
run into any problems in that short period.

> Here, if the network connection fails and is back up in short periods
> of time you'll always be experiencing delays as gluster is often
> waiting for timeouts, then the server is visible again, it auto-heals,
> then it's not visible and it has to timeout.
> It'll likely work just fine, but this will seem pretty slow (but no
> more so than an NFS mount behind a faulty connection I suppose).

Are you saying that when the network is down I can't use the files
stored locally on machine A, even though all I am doing is reading and
perhaps writing locally on A and not touching any files on B?  That
could be a bit of a problem in my case since I usually keep any local
builds on the partitions I'd like to share.  There could be similar
problems when one machine is down for maintenance, for example, or when
a disk crashes.

> Things can be further complicated if you have some clients that can
> see SERVER 1 and other clients that only see SERVER 2.  If this
> happens, then you will increase the likelihood of a split brain
> situation and things will go wrong when it tries to auto-heal (most
> likely requiring manual intervention to get back to a stable state).

Ahh, but in my case I don't have that problem: there are only two
machines (currently, at any rate).

> so the replication features of gluster/HA will most likely solve your
> problem.
> if you have specific concerns, post a volume config to the group so
> people can advise you on a specific configuration.

Well, I posted my configuration a while ago here:

  http://gluster.org/pipermail/gluster-users/20081217/000899.html

The attachment was scrubbed, which makes it a pain to get from there,
so I enclose the relevant parts below.

Many thanks.

cheers,
prabhu

--- glusterfs-server.vol ---

# Unify four partitions (I only display two here for brevity).

# Declare the storage directories.
volume posix1
  type storage/posix
  option directory /export/sda5/export
end-volume

# The namespace storage.
volume posix-ns
  type storage/posix
  option directory /export/sdb6/export-ns
end-volume

# snip similar blocks for other partitions

# Add locking on top of the storage volumes to create the bricks.
volume brick1
  type features/posix-locks
  option mandatory on          # enables mandatory locking on all files
  subvolumes posix1
end-volume

volume brick-ns
  type features/posix-locks
  option mandatory on          # enables mandatory locking on all files
  subvolumes posix-ns
end-volume

# Now unify the bricks.
volume unify
  type cluster/unify
  option namespace brick-ns
  subvolumes brick1 brick2 brick3 brick4
  option scheduler rr
end-volume

# Serve the unified brick on the network.
volume server
  type protocol/server
  option transport-type tcp/server
  subvolumes unify
  option auth.addr.unify.allow x.y.z.w,127.0.0.1
end-volume

------- EOF ------------

------- glusterfs-client.vol ----------

# The unified servers: 1 and 2.
volume server1
  type protocol/client
  option transport-type tcp/client
  option remote-host w.x.y.z
  option remote-subvolume unify
end-volume

volume server2
  type protocol/client
  option transport-type tcp/client
  option remote-host w1.x.y.z
  option remote-subvolume unify
end-volume

# AFR the two servers.
volume afr
  type cluster/afr
  subvolumes server1 server2
end-volume

-------- EOF ------------------
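
P.S.  Regarding using the local files on A: one thing I may try, if the
AFR translator in the release we're running supports it, is the
"option read-subvolume" I've seen mentioned in the docs, so that each
machine at least prefers its own unify for reads.  The sketch below is
untested and the option name is an assumption on my part, so corrections
are welcome; machine B's client would name server2 there instead.

--- glusterfs-client.vol on machine A (untested sketch) ---
# server1 and server2 are the same protocol/client volumes as above
# (server1 = the unify exported by machine A, server2 = by machine B).

# AFR the two servers, preferring the local one for reads.
volume afr
  type cluster/afr
  subvolumes server1 server2
  # Assumed option: read from server1 (the local unify) when possible.
  option read-subvolume server1
end-volume
-------- EOF ------------------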