Hi,

I'd also like to put my vote in for a public specification document detailing how AFR handles the various failure and recovery scenarios. (Thanks for raising this, Gordan.)

Below the quoted message I've added a few rough, illustrative sketches of some of the ideas being discussed.

Kind regards,
Geoff Kassel.

On Fri, 3 Apr 2009, Gordan Bobic wrote:
> On Fri, 3 Apr 2009 13:34:29 +0200, nicolas prochazka
> <prochazka.nicolas@xxxxxxxxx> wrote:
> > It seems there are a lot of problems with self healing, and one of them is that glusterfs uses one server as the reference (the first in subvolumes) (afr_sh_select_source?)
>
> This brings up an interesting point - what is the conflict resolution supposed to be? The favorite-child option should be the resolution of last resort (i.e. used only when the timestamp metadata is identical). The primary resolution should be, IIRC, that the latest file wins. However, this poses potential problems.
>
> Consider this scenario: the primary crashes, so we only have the secondary. Files on the secondary change while it is the only server. The primary comes back, but crashes again mid-sync. Next time it comes back it has a partially synced file, and it's the favorite-child, so unless the metadata (specifically the timestamps) gets synced _last_, the partially synced file would clobber the whole file. Does the metadata get synced last? It's the only sane option as far as I can tell, but I've seen situations before where the timestamps on the new server get stuck at the epoch (01-01-1970) after a (successful) resync.
>
> Can somebody point at a definitive spec document for how AFR healing is _supposed_ to operate under the various failure and resync scenarios? It currently seems to be in quite a dangerous state, and nowhere near enough warnings are being given about something that can cause extensive data corruption/loss. If such a specification exists, then it should be pretty easy to create test cases for it. Speaking of which, is there a test harness available for it? It would be really useful to be able to do something like "make test" before "make install". It would also encourage more technical users to add test cases for things they find broken, and it would provide a baseline for regressions, to make sure that something that worked is never broken in a later release. My perception is that the stability/bug count has been getting progressively worse in every release since rc1.
>
> Another thing - since files being in sync is such a problematic thing at the moment, how about md5 and last-sync-timestamp fields in the metadata for each file? These, coupled with an external cron job that computes/verifies/updates them and can run [daily|weekly|monthly] (depending on the amount of data), would at least provide a secondary sanity check to make sure file corruption/de-sync gets detected early and reliably. Not having such a thing is really just sticking one's head in the sand and ignoring the issue.
>
> Another thing - if a file is open for write, I think there should be a metadata flag set, and it should be unset when the last write handle is closed. When the server comes up, if any such flags are set before any write opens have been received, then the file should be marked as crashed, and it should explicitly be prevented from being the sync-source. There are a lot of error/resync use cases, and it might be a good time for them to be enumerated and systematically tested against to minimize the risk of data loss.
>
> Gordan
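To make the selection policy Gordan describes concrete, here is a tiny sketch in Python: the copy with the newest mtime wins, and the favorite-child setting breaks ties only when the timestamps are identical. This is purely illustrative - it is not the actual afr_sh_select_source() logic, and the Replica/select_heal_source names are made up.

    # Sketch of the suggested self-heal source selection policy:
    # newest mtime wins; favorite-child is only a tie-breaker.
    from dataclasses import dataclass
    from typing import List

    @dataclass
    class Replica:
        name: str        # e.g. "server1", "server2"
        mtime: float     # last-modified timestamp of this copy of the file
        favorite: bool   # True if this is the configured favorite-child

    def select_heal_source(replicas: List[Replica]) -> Replica:
        """Newest copy wins; favorite-child only breaks exact-timestamp ties."""
        newest = max(r.mtime for r in replicas)
        candidates = [r for r in replicas if r.mtime == newest]
        if len(candidates) == 1:
            return candidates[0]             # the latest file wins outright
        for r in candidates:
            if r.favorite:
                return r                     # last resort: favorite-child
        return candidates[0]                 # identical, no favorite: pick deterministically

    # Example: the secondary's newer copy beats the favorite-child primary.
    src = select_heal_source([Replica("primary", 1238700000.0, True),
                              Replica("secondary", 1238760000.0, False)])
    assert src.name == "secondary"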
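In the spirit of the "make test" suggestion, here is the kind of tiny, self-contained regression test a harness could carry: after a (simulated) successful resync, the destination must end up with the source's timestamp rather than being stuck at the epoch. simulated_resync() and the temp-file layout are made up for illustration; this is not GlusterFS heal code, it simply copies the data first and applies the mtime last.

    # Regression-test sketch: a resynced copy must carry the source's mtime,
    # not 01-01-1970. Purely illustrative, not GlusterFS test code.
    import os, shutil, tempfile, unittest

    def simulated_resync(src, dst):
        shutil.copyfile(src, dst)                   # data first...
        st = os.stat(src)
        os.utime(dst, (st.st_atime, st.st_mtime))   # ...metadata (mtime) last

    class ResyncTimestampTest(unittest.TestCase):
        def test_mtime_copied_and_not_epoch(self):
            d = tempfile.mkdtemp()
            try:
                src = os.path.join(d, "src")
                dst = os.path.join(d, "dst")
                with open(src, "w") as f:
                    f.write("content written while the other server was down\n")
                os.utime(src, (1238760000, 1238760000))   # a known, non-epoch mtime
                simulated_resync(src, dst)
                self.assertEqual(int(os.stat(dst).st_mtime), 1238760000)
                self.assertNotEqual(int(os.stat(dst).st_mtime), 0)  # not 01-01-1970
            finally:
                shutil.rmtree(d)

    if __name__ == "__main__":
        unittest.main()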
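The md5/last-sync-timestamp idea could be prototyped entirely outside the filesystem with a cron-driven scrub script. The sketch below stores the checksum and last-scrub time in user extended attributes (Linux-only os.getxattr/os.setxattr) and flags any file whose content changed without its mtime changing. The attribute names (user.scrub.md5, user.scrub.time) are invented for illustration and are not anything GlusterFS currently uses.

    # Sketch of a periodic checksum scrub, run from cron against a backend dir.
    import hashlib, os, sys, time

    MD5_ATTR  = "user.scrub.md5"    # hypothetical attribute names
    TIME_ATTR = "user.scrub.time"

    def md5_of(path):
        h = hashlib.md5()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
        return h.hexdigest().encode()

    def scrub(root):
        for dirpath, _dirs, files in os.walk(root):
            for name in files:
                path = os.path.join(dirpath, name)
                mtime = os.stat(path).st_mtime
                try:
                    stored_md5 = os.getxattr(path, MD5_ATTR)
                    last_scrub = float(os.getxattr(path, TIME_ATTR))
                except OSError:
                    stored_md5, last_scrub = None, 0.0   # never scrubbed before
                if mtime > last_scrub:
                    stored_md5 = None    # file changed legitimately; just re-record
                current = md5_of(path)
                if stored_md5 is not None and stored_md5 != current:
                    # content changed but mtime did not: corruption or de-sync
                    print("CHECKSUM MISMATCH: %s" % path, file=sys.stderr)
                    continue             # keep the old record for inspection
                os.setxattr(path, MD5_ATTR, current)
                os.setxattr(path, TIME_ATTR, str(time.time()).encode())

    if __name__ == "__main__":
        scrub(sys.argv[1])               # e.g. scrub /data/export from a daily cron job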
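The open-for-write flag could look something like the following: a marker is set when the first write handle opens, cleared when the last one closes, and any file still carrying the marker at server start-up is treated as possibly partial and excluded from being a heal source. Again, the attribute name (user.afr.dirty-open) and the helper functions are hypothetical, not existing GlusterFS behaviour.

    # Sketch of an "open for write" marker that survives a crash.
    import os

    DIRTY_ATTR = "user.afr.dirty-open"   # hypothetical attribute name
    _open_writers = {}                   # path -> number of open write handles

    def note_write_open(path):
        """Called when a write handle is opened; the first writer sets the flag."""
        if _open_writers.get(path, 0) == 0:
            os.setxattr(path, DIRTY_ATTR, b"1")
        _open_writers[path] = _open_writers.get(path, 0) + 1

    def note_write_close(path):
        """Called when a write handle is closed; the last writer clears the flag."""
        _open_writers[path] -= 1
        if _open_writers[path] == 0:
            os.removexattr(path, DIRTY_ATTR)
            del _open_writers[path]

    def crashed_while_open(path):
        """At server start-up, before any new write opens: True means the flag
        survived a crash, so this copy may be partial and must never be chosen
        as the sync source."""
        try:
            os.getxattr(path, DIRTY_ATTR)
            return True
        except OSError:
            return False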