On Fri, 18 Apr 2008, Christopher Hawkins wrote:
See: http://www.gluster.org/docs/index.php/Understanding_AFR_Translator At the bottom of the page are examples that initiate the sync. To clarify this point and some of your other questions in the split-brain thread: automatic re-sync after one server has been down is not available yet, but is coming in the next release with an HA translator. For now you can do a total re-sync manually by the method listed above, or let the cluster re-sync itself over time, because accessing a file for a read or write causes that file to be synced.
I'm aware of the "read 1 byte/line from a file to sync it" approach. The problem I am seeing, however, is that I cannot see files that were created on node2 before node1 was started. If I cannot see them, I cannot read them to sync them.
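For reference, a minimal sketch of that trick as a script, assuming the volume is mounted at /mnt/glusterfs (the path is a placeholder; adjust it to your setup). Walking the tree re-reads the directory entries, and reading one byte of each regular file makes AFR compare versions and self-heal that file:

```shell
#!/bin/sh
# MOUNT is a hypothetical GlusterFS client mount point; override as needed.
MOUNT=${MOUNT:-/mnt/glusterfs}

# Re-read all directory entries first, then read the first byte of every
# regular file so AFR checks each one and syncs it if the copies differ.
ls -lR "$MOUNT" > /dev/null
find "$MOUNT" -type f -print0 | xargs -0 -r -I{} head -c1 {} > /dev/null
```

Of course this only reaches files the client can actually see, which is exactly the problem described above.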
I assume this is because my underlying FS (Reiser4) has issues with xattrs.
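One quick way to test that theory, assuming the attr tools (setfattr/getfattr) are installed: AFR keeps its versioning metadata in extended attributes on the backend export, so if a trivial user xattr cannot be stored on the Reiser4 partition, AFR cannot work there either. The directory path below is a placeholder:

```shell
#!/bin/sh
# DIR should be the backend export directory (placeholder path).
DIR=${DIR:-/data/export}
F="$DIR/.xattr-test.$$"

touch "$F"
# If the filesystem can't store a trivial user xattr, it certainly can't
# store AFR's metadata xattrs either.
if setfattr -n user.afrtest -v ok "$F" && getfattr -n user.afrtest "$F" > /dev/null; then
    echo "xattrs OK"
else
    echo "xattrs NOT supported"
fi
rm -f "$F"
```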
You don't have to AFR from the client side, but you can. You can also do it on the server side, or even both. Part of the beauty of glusterfs is the simple building blocks - you can set it up any number of ways. Personally I don't think n-fold increases in client bandwidth for mirroring are all that bad. How many "mirrors" do you really need?? :-)
Fair. How do I configure server-server mirroring?
The server AFR's to other servers, then unifies the AFR'd volumes, then exports them. The clients mount the export from any given server using round robin dns or something similar (probably will be deprecated once the HA translator is available).
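Roughly, each server's volume spec would contain something like the fragment below (1.3-style vol-file syntax; the hostnames, paths and volume names are placeholders, and the cluster/unify layer mentioned above is omitted for brevity). Each server exports its local brick mirrored against the other server's brick:

```
# Local backend storage (placeholder path).
volume posix
  type storage/posix
  option directory /data/export
end-volume

# This server acting as a client of the other server's raw brick.
volume server2-brick
  type protocol/client
  option transport-type tcp/client
  option remote-host server2
  option remote-subvolume posix
end-volume

# Mirror the local brick with server2's brick.
volume afr
  type cluster/afr
  subvolumes posix server2-brick
end-volume

# Export the mirrored volume to the clients.
volume server
  type protocol/server
  option transport-type tcp/server
  option auth.ip.afr.allow *
  subvolumes afr
end-volume
```

server2's spec would be the mirror image, pointing its protocol/client volume back at server1.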
You mean, have servers AFR as clients, then re-export the AFR-ed volume again? GlusterFS on top of GlusterFS?
That way the client needs only N*1 bandwidth (but the servers need N* (num of AFR's)). So if you only need to keep 2x copies of your data, you never need more than 2x the bandwidth. And no matter what cluster filesystem you use, I can't think of a way to get 2x the files without 2x the writes.
Sure, I accept that. I was just asking if there was a way to make the additional writes server-side, because servers are few and clients are many, so multiplying the server-side bandwidth will generally cost less than multiplying every client's bandwidth.
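To make the arithmetic concrete (the figures are hypothetical): with 2-way mirroring, client-side AFR doubles every client's uplink traffic, while server-side AFR leaves the client's uplink alone and moves the extra copy onto the server-to-server link:

```shell
#!/bin/sh
# Hypothetical figures: W MB written by one client, R-way mirroring.
W=100
R=2

echo "client-side AFR: client uplink carries $((W * R)) MB"
echo "server-side AFR: client uplink carries $W MB, servers relay $((W * (R - 1))) MB"
```

Either way the cluster as a whole still performs R writes; the question is only which links carry them.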
There is no fencing and no quorum - the cluster is essentially stateless, which is really great because if you build it right then you can't really have a situation where split brain is possible (ok, VERY, very unlikely).
I can see that it's less of an issue than block-level split-brain, because this would at most lead to the odd file getting corrupted, whereas block-level split-brain would destroy the entire FS very quickly.
All clients connect on the same port, so if you AFR on the client side, say, then it's tough to imagine how one client would be able to write to a server while another client would think it was down, and yet would still have access to another server on the same network and could write to it. Of course if you don't consider these issues at build time, it is possible to set yourself up for disaster in certain situations. But that's the case with anything cluster related... All in all I think it's a tremendous filesystem tool.
I agree, but I don't think split-brain conditions are as rare or as preventable as you are implying. Whenever there is more than one server, split-brain is possible. Especially if, for example, you want two mirrored servers and each is a client of the mirrored cluster pair. If the connection between the servers fails, each server can still see its own mirrored copy and keeps working, thus causing a split-brain. Filesystems like GFS implement quorum and fencing to prevent exactly this situation.
So, although a split-brain is less terminal than with GFS (file corruption rather than file system corruption), it is still possible.
Gordan