On Fri, 9 May 2008, Marcus Herou wrote:
> So basically the only case when a server is disconnected is when I take
> it down for maintenance
Or your switch / NIC dies. Or anything else goes wrong that prevents the
servers from staying connected (e.g. a server crash).
> and when it comes up it will self-heal right? And by adding
> a brand new server it will as well be synched I hope.
In theory - yes.
In practice, there are some caveats. There are some nasty race conditions
that make split-brain even more dangerous than you might expect.
Files are versioned. Deleting and re-creating a file causes the version to
be reset. Some programs delete and re-create a file rather than modifying
it in place (e.g. vi does). This has a number of dangerous side effects. If
your disconnected server has an old copy that was incrementally modified
(e.g. a log file being appended to), its version will be high. If you
delete and re-create the file, or do something that has the same effect
(e.g. edit it with vi), the version on the working servers will be reset
(to a low number).
When the server that dropped out reconnects, its version will be higher
than the new (reset) version, and its old file will clobber the new file.
This is, IMO, a shockingly dangerous "feature". It means that the old file
on the disconnected server can easily supersede the new file on the
working cluster.
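To make the sequence of events concrete, here is a toy model of it in
Python. This is only a sketch, not the actual replication code: the
Replica class, the helper names, and the "highest version wins" heal rule
are assumptions made up for illustration, based purely on the behaviour
described above.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    """Hypothetical model of one server's copy of a file."""
    content: str
    version: int


def modify(replica: Replica, extra: str) -> None:
    """In-place modification (e.g. appending to a log) bumps the version."""
    replica.content += extra
    replica.version += 1


def recreate(replica: Replica, content: str) -> None:
    """Delete + re-create (e.g. saving from vi) resets the version."""
    replica.content = content
    replica.version = 1


def self_heal(a: Replica, b: Replica) -> None:
    """Assumed heal rule: the copy with the higher version wins."""
    winner = a if a.version >= b.version else b
    a.content = b.content = winner.content
    a.version = b.version = winner.version


def simulate() -> Replica:
    working = Replica("line 1\n", 1)
    dropped = Replica("line 1\n", 1)

    # Both servers are connected while the file is appended to.
    for i in range(2, 6):
        modify(working, f"line {i}\n")
        modify(dropped, f"line {i}\n")

    # 'dropped' disconnects here with version 5.
    # On the working server the file is deleted and re-created.
    recreate(working, "new file\n")  # version is now 1

    # 'dropped' reconnects: 5 > 1, so its stale copy wins.
    self_heal(working, dropped)
    return working


if __name__ == "__main__":
    print(simulate())
```

After simulate() runs, the working server holds the old five-line file
again at version 5, and the freshly written "new file" is gone.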
Gordan