Hi,
I'm experimenting with GlusterFS and I have a few questions that the
documentation seems to leave unanswered.
1) Replication (AFR)
How does this work? I can see from my test setup that the mounted FS has
the same content (files) as the backing directory on the server. The
server configuration only lists the local node and not the other peer
servers. This implies that all replication/syncing is done by the
clients. This in turn implies that the read load can be shared between
the servers (scaling as n, where n is the number of servers), but every
write gets sent to each server, so write performance scales inversely
with the number of mirrors (1/n). This seems
quite poor. Am I misunderstanding how this works? Do the servers
replicate between themselves? Or does all replication really happen on
the client nodes? How would this handle writes made directly to the
backing directory on the server while a client is trying to write to the
same directory/files? Would this work the same as NFS would, or is there
a hard requirement to always access the data via the glusterfs mount
point? (I understand that this is only possible when using AFR, and not
with striping.)
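To make the scaling assumption in question 1 concrete, here is a rough
back-of-envelope sketch in Python. It is only a model of my guess that
the client sends every write to all n servers, not a description of what
GlusterFS actually does. The function name and the 100 MB/s figure are
made up for illustration.

```python
def estimated_throughput(n_servers, client_bandwidth=100.0):
    """Model of the guess above: replication is done entirely by the client.

    client_bandwidth is a hypothetical figure (MB/s) for one client's link.
    Returns (aggregate read capacity, per-client write throughput).
    """
    # Any one replica can serve a read, so read capacity grows with n.
    read = client_bandwidth * n_servers
    # Every write is sent to all n replicas over the same client link,
    # so the useful write throughput is divided by n.
    write = client_bandwidth / n_servers
    return read, write


for n in (1, 2, 4):
    print(n, estimated_throughput(n))
```

If the servers replicated between themselves instead, the write figure
would presumably not drop with n on the client side, which is what I am
trying to find out.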
2) Splitbrain
How is recovery from this situation handled? Which file wins,
and which file gets clobbered? Is there any scope for conflict
resolution (e.g. as in Coda)?
3) Metadata Storage
When using striping, how does the file data get split, and how/where is
the metadata kept?
4) Fencing and Quorum
Is there error/desync detection, and are there such concepts as fencing
of dead server nodes and quorum to prevent splitbrain operation?
5) Metadata Change Detection
I understand from the documentation that replication/sync (where
required) happens when opening a file for reading. What about metadata?
Are metadata changes (e.g. file size, modification time) visible to the
clients when the file changes on the server and the client issues an ls?
Or is it necessary to read the file before issuing ls to get the
size/timestamp metadata? What about touching a file? Does this cause the
file to be synced? Would it cause the file to get corrupted if another
node updated the file's content in the meantime?
Thanks.
Gordan