Re: solutions for split brain situation

Anand Avati <avati@xxxxxxxxxxx> · Fri, 18 Sep 2009 23:46:09 +0530

> That all said - the GlusterFS representative responded they have chosen to
> err on the side of "conservative", where they choose to keep the file if
> they cannot find proof that it should be removed, which unintentionally
> supports your model. This being the case, it does lead to supporting your
> position, as you are also looking for "conservative" behaviour in the case
> of an error path during self-heal and backend consistency checks.

GlusterFS healing files with missing extended attributes is one
thing. The purpose of this "feature" is that, in those corner cases
where an fsck strips the xattrs off a few files, those files get
reconstructed (the "conservative" approach). If this were all that was
necessary for a soft migration, why would we not portray it as a
feature? Because there are a few other considerations, which make
this conservative healing different from soft migration.
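
To make the "files with missing xattrs" case concrete, here is a
minimal sketch that walks a backend export directory and reports files
carrying no replicate bookkeeping xattr. It is an illustration only,
not how GlusterFS itself decides what to heal. Assumptions: the xattr
prefix "trusted.afr." is taken as the replicate translator's
namespace, the function name and the export path are hypothetical, and
the script runs as root on Linux (the trusted.* namespace is not
visible otherwise).

```python
import os


def files_without_xattr(root, prefix="trusted.afr.", list_xattrs=None):
    """Return sorted paths under `root` that have no xattr starting with `prefix`.

    `prefix` is an assumed name for the replicate bookkeeping attributes.
    `list_xattrs` can be replaced for testing; it defaults to os.listxattr,
    which exists on Linux only.
    """
    if list_xattrs is None:
        list_xattrs = os.listxattr
    missing = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                names = list_xattrs(path)
            except OSError:
                # Unreadable entry (vanished file, permissions): skip it.
                continue
            if not any(n.startswith(prefix) for n in names):
                missing.append(path)
    return sorted(missing)
```

Run against a backend directory, e.g.
files_without_xattr("/data/export") with a hypothetical path, it only
reads attributes and changes nothing on disk.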

When soft migration is claimed to be supported, the user expects the
filesystem to be usable from the moment she mounts it, with
pre-existing data present in one backend. The filesystem is expected
to be "operational" from that very moment. An initial 'ls -lR' would
start recreating the second copy. If the 'ls -lR' runs to completion,
the user is all set. That is the best-case scenario. Now what if the
source node goes down in the middle of the healing? Nothing prevents
the user from continuing to use the mountpoint with only the
half-healed second server and making changes that conflict with the
not-yet-healed half on the first server. A "supported" soft migration
is expected to handle all such situations. The current self heal does
not, and might result in split brain in those situations. Which is why
we say that using the "conservative" healing as a soft migration
"feature" is misusing (rather, stretching beyond its limits) the self
heal feature.
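
The failure sequence above can be mimicked with a toy model. This is
only a sketch under simplifying assumptions: each server is a plain
dict of filename to contents, "healing" just copies missing files in
sorted order, and the names heal, conflicts, first and second are
made up for illustration. It is not GlusterFS's self-heal algorithm.

```python
def heal(source, sink, limit=None):
    """Copy files missing on `sink` from `source`, in sorted order.

    Stops after `limit` files to imitate the source going down mid-heal.
    Returns the number of files copied.
    """
    healed = 0
    for name in sorted(source):
        if limit is not None and healed >= limit:
            break
        if name not in sink:
            sink[name] = source[name]
            healed += 1
    return healed


def conflicts(a, b):
    """Names present on both sides with different contents."""
    return sorted(n for n in a.keys() & b.keys() if a[n] != b[n])


# First server holds the pre-existing data; second server starts empty.
first = {"a.txt": "old a", "b.txt": "old b", "c.txt": "old c"}
second = {}

heal(first, second, limit=1)   # only a.txt is healed, then first goes down
second["b.txt"] = "new b"      # user keeps working on the half-healed copy

# First server returns: b.txt now exists on both sides with different
# contents, and nothing records which copy is authoritative.
result = conflicts(first, second)
print(result)
```

In this toy run b.txt is the conflicting file: neither side has any
record saying the other is stale, which is the situation described
above.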

In fact, what happened in the original post of this thread is very
similar to what I just described. The "soft migration" was interrupted
in the middle, and changes were made to the second server (in fact,
directly in the backend!) which ended up being incompatible with the
first server when everything was brought back up. The xattrs were
obviously inconsistent, and the result was split brain. The right way
would have been to just copy everything in through the mountpoint.
Using self heal in this (undocumented) way is really stretching it
beyond its limits and expecting what was never a claimed feature.
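
As a sketch of the copy-through-the-mountpoint approach: the point is
that every write goes through the mounted filesystem, so the
replication layer sees it, rather than being placed in a backend
directory behind its back. The function name is hypothetical, the
target is assumed to be an already-mounted GlusterFS client mount, and
dirs_exist_ok needs Python 3.8 or later.

```python
import shutil
from pathlib import Path


def copy_into_mount(src, mountpoint):
    """Copy the tree under `src` into `mountpoint` using ordinary file I/O.

    `mountpoint` is assumed to be the client mount, never a backend
    export directory. Returns the relative paths of files found there
    after the copy.
    """
    dst = Path(mountpoint)
    if not dst.is_dir():
        raise NotADirectoryError(f"{dst} is not an existing directory")
    # copytree uses copy2 by default, so timestamps and modes are kept.
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return sorted(str(p.relative_to(dst)) for p in dst.rglob("*") if p.is_file())
```

Usage would look like copy_into_mount("/old/data", "/mnt/glusterfs"),
with both paths being placeholders for the real source directory and
client mount.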

Avati