gordan@xxxxxxxxxx wrote:
On Wed, 7 May 2008, Anand Avati wrote:
The only way I see to ensure data integrity is to have some arbiter vet
all writes. You can try to make that arbiter redundant, but good luck
making it actually distributed.
I've seen the distributed arbiter done in proprietary software, so it
must be possible. The design is pretty clear to me, but I have no idea
where to start integrating the idea into glusterfs, though gluster's the
closest thing to what I need that I've seen in open source.
Can you give some details/links? We would be interested to learn about
it.
I suspect what was referred to was a system where every host is notified
of each lock, not an actual load-sharing system. DLM (RHCS/GFS) does
it by multicasting, presumably with acknowledgements being returned from
each connected node. I've not looked at the DLM protocol in great
detail, so I can't say exactly how it works.
Actually, I was thinking of WANdisco's Multi-site CVS/SVN/MySQL
mirroring software. It's not generalized to the point of being a disk
load sharing system, exactly, but I think the concept and the problems
are the same. They use a quorum locking model and basically journal the
transaction with whichever server they are wrapping for later replay on
the other servers.
There used to be a white paper on WANdisco's protocol online (I haven't
looked recently). I didn't know much about DLM (and, after reading what
documentation I could find online just now, I don't feel like I know
much more), but it sounds like DLM uses a similar quorum model for locking.
As for the versioning (and perhaps this is relevant to the discussion
taking place in another thread), I don't see how this can be done
without meta-data journaling, so why not make things even simpler and
share a unique version number between all entities changed in a
transaction? So, for any server to acquire an implicit write lock, the
quorum must agree to increment a global transaction ID (which could also
be stored in the FS as the version number of each directory and/or file
changed). As long as any given system knew that its journal/replay was
up to date with the latest transaction ID according to the quorum, it
could trust a file's content without consulting a file-specific
revision number.
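
To make the idea concrete, here is a minimal sketch of the quorum vote in
Python. It is an illustration only, not glusterfs, DLM, or WANdisco code:
the names Node, vote, and acquire_write are hypothetical, and it assumes a
strict majority counts as the quorum and that a node accepts only the ID
immediately following the one it already holds.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """One server (hypothetical); txid is the last transaction ID it accepted."""
    name: str
    txid: int = 0
    reachable: bool = True

    def vote(self, proposed: int) -> bool:
        # Assumption: a node agrees only to the next ID after its own.
        return self.reachable and proposed == self.txid + 1


def acquire_write(nodes, writer):
    """Return the new global transaction ID, or None if no quorum agreed."""
    proposed = writer.txid + 1
    voters = [n for n in nodes if n.vote(proposed)]
    if len(voters) * 2 <= len(nodes):  # assumption: strict majority required
        return None
    for n in voters:
        n.txid = proposed  # agreeing nodes adopt the new ID
    return proposed


a, b, c = Node("a"), Node("b"), Node("c")
cluster = [a, b, c]
first = acquire_write(cluster, a)    # all three agree
c.reachable = False
second = acquire_write(cluster, b)   # a and b still form a majority
c.reachable = True
stale = acquire_write(cluster, c)    # c is behind, so the majority refuses
```

In this sketch a server that has fallen behind cannot obtain the write
lock, because the ID it proposes is not the one the majority expects. It
would have to catch up on the journal first.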
If a server were not completely up to date, it would at least have
to synchronize the meta-data journal and consult it to find out whether
a requested file had any pending writes, and then decide whether it
needed to synchronize the file before serving it.
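
The journal check could look roughly like the following Python sketch.
The journal layout (a list of (transaction ID, path) pairs) and the
function names files_needing_sync and can_serve are assumptions made for
illustration, not an existing interface.

```python
def files_needing_sync(journal, local_txid):
    """Paths touched by transactions this server has not yet replayed.

    journal: iterable of (txid, path) pairs (hypothetical layout).
    local_txid: last transaction ID replayed on this server.
    """
    return {path for txid, path in journal if txid > local_txid}


def can_serve(path, journal, local_txid):
    """True if the local copy of path has no pending writes in the journal."""
    return path not in files_needing_sync(journal, local_txid)


# Example: the server has replayed transaction 1 only.
journal = [(1, "a.txt"), (2, "b.txt"), (3, "a.txt")]
pending = files_needing_sync(journal, local_txid=1)
```

Here a.txt and b.txt would need to be synchronized before being served,
while a file with no journal entries after transaction 1 can be served
from the local copy.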
Regards,
Derek
--
Derek R. Price
Solutions Architect
Ximbiot, LLC <http://ximbiot.com>
Get CVS and Subversion Support from Ximbiot!
v: +1 248.835.1260
f: +1 248.246.1176