I see that the AFR automatic replication module is now listed as complete and
scheduled for the February 20th 1.4 release. While we are eagerly awaiting it,
is there any documentation somewhere about how it operates?
It sounds like this will provide an easily installed and configured, truly
redundant (no single points of failure) high-performance distributed
filesystem, without having to resort to extra layers (such as DRBD). In
other words, filesystem nirvana. ;-)
Some somewhat specific questions:
1) When does replication take place? When a client is writing to a file, does
it write to two (or more) servers at the same time, or does one server
replicate the file to the other server after close, or...? If a replica server
goes down, how does it catch up with changes made while it was down? Do the clients
know to use the good replica server while the other server is catching up?
2) Assuming replicas are in sync, will clients load-balance reads across the
multiple replicas for increased performance? What about writes?
3) How are locks handled between replicas?
4) GlusterFS looks to be extremely flexible with regard to its configuration,
but I just want to be sure: if AFR is working with multiple nodes, each
containing multiple disks as part of a filesystem, will we be able to guarantee
that replicas will be stored on different nodes (i.e., so that a node can fail
and the data will still be fully available)?
A completely vague, general question:
I'd like to run standard departmental services across one or more redundant,
distributed filesystems. This would include home directories and data
directories, as well as directories that would be shared amongst multiple
servers for the express purpose of running redundant (and load-leveled, when
possible) services (mail, web, load-leveled NFS export for any systems that
can't mount the filesystem directly, hopefully load-leveled SAMBA, etc.).
Would GlusterFS (with AFR) be a good match?
I've been working with Lustre+DRBD+Heartbeat, and it seems like it will work,
but the complexity of it makes me nervous (it may be fragile and prone to
breakage with future updates, especially given its continued strong dependence
on numerous kernel patches that differ by kernel release). GlusterFS sounds much
simpler (thanks to its modularity) and is about to get built-in replication...
Thanks,
Brent Nelson
Director of Computing
Dept. of Physics
University of Florida