Hi,

I'd also like to put my vote in for a public specification document detailing how AFR handles the various failure and recovery scenarios. (Thanks for raising this, Gordan.)

Below the quoted message I've added a few rough, illustrative sketches of some of the ideas being discussed.

Kind regards,
Geoff Kassel.

On Fri, 3 Apr 2009, Gordan Bobic wrote:
> On Fri, 3 Apr 2009 13:34:29 +0200, nicolas prochazka
> <prochazka.nicolas@xxxxxxxxx> wrote:
> > It seems there are a lot of problems with self healing, and one of them is that glusterfs uses one server as the reference (the first in subvolumes) (afr_sh_select_source?)
>
> This brings up an interesting point - what is the conflict resolution supposed to be? The favorite-child option should be the resolution of last resort (i.e. used only when the timestamp metadata is identical). The primary resolution should be, IIRC, that the latest file wins. However, this poses potential problems.
>
> Consider this scenario: the primary crashes, so we only have the secondary. Files on the secondary change while it is the only server. The primary comes back, but crashes again mid-sync. Next time it comes back it has a partially synced file, and it's the favorite-child, so unless the metadata (specifically the timestamps) gets synced _last_, the partially synced file would clobber the whole file. Does the metadata get synced last? It's the only sane option as far as I can tell, but I've seen situations before where the timestamps on the new server get stuck at the epoch (01-01-1970) after a (successful) resync.
>
> Can somebody point at a definitive spec document for how AFR healing is _supposed_ to operate under the various failure and resync scenarios? It currently seems to be in quite a dangerous state, and nowhere near enough warnings are being given about something that can cause extensive data corruption/loss. If such a specification exists, then it should be pretty easy to create test cases for it. Speaking of which, is there a test harness available for it? It would be really useful to be able to do something like "make test" before "make install". It would also encourage more technical users to add test cases for things they find broken, and it would provide a baseline for regressions, to make sure that something that worked is never broken in a later release. My perception is that the stability/bug count has been getting progressively worse in every release since rc1.
>
> Another thing - since files being in sync is such a problematic thing at the moment, how about md5 and last-sync-timestamp fields in the metadata for each file? These, coupled with an external cron job that computes/verifies/updates them and can run [daily|weekly|monthly] (depending on the amount of data), would at least provide a secondary sanity check to make sure file corruption/de-sync gets detected early and reliably. Not having such a thing is really just sticking one's head in the sand and ignoring the issue.
>
> Another thing - if a file is open for write, I think there should be a metadata flag set, and it should be unset when the last write handle is closed. When the server comes up, if any such flags are set before any write opens have been received, then the file should be marked as crashed, and it should explicitly be prevented from being the sync-source. There are a lot of error/resync use cases, and it might be a good time for them to be enumerated and systematically tested against to minimize the risk of data loss.
>
> Gordan
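To make the selection policy Gordan describes concrete, here is a tiny sketch in Python: the copy with the newest mtime wins, and the favorite-child setting breaks ties only when the timestamps are identical. This is purely illustrative - it is not the actual afr_sh_select_source() logic, and the Replica/select_heal_source names are made up.

    # Sketch of the suggested self-heal source selection policy:
    # newest mtime wins; favorite-child is only a tie-breaker.
    from dataclasses import dataclass
    from typing import List

    @dataclass
    class Replica:
        name: str        # e.g. "server1", "server2"
        mtime: float     # last-modified timestamp of this copy of the file
        favorite: bool   # True if this is the configured favorite-child

    def select_heal_source(replicas: List[Replica]) -> Replica:
        """Newest copy wins; favorite-child only breaks exact-timestamp ties."""
        newest = max(r.mtime for r in replicas)
        candidates = [r for r in replicas if r.mtime == newest]
        if len(candidates) == 1:
            return candidates[0]             # the latest file wins outright
        for r in candidates:
            if r.favorite:
                return r                     # last resort: favorite-child
        return candidates[0]                 # identical, no favorite: pick deterministically

    # Example: the secondary's newer copy beats the favorite-child primary.
    src = select_heal_source([Replica("primary", 1238700000.0, True),
                              Replica("secondary", 1238760000.0, False)])
    assert src.name == "secondary"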
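In the spirit of the "make test" suggestion, here is the kind of tiny, self-contained regression test a harness could carry: after a (simulated) successful resync, the destination must end up with the source's timestamp rather than being stuck at the epoch. simulated_resync() and the temp-file layout are made up for illustration; this is not GlusterFS heal code, it simply copies the data first and applies the mtime last.

    # Regression-test sketch: a resynced copy must carry the source's mtime,
    # not 01-01-1970. Purely illustrative, not GlusterFS test code.
    import os, shutil, tempfile, unittest

    def simulated_resync(src, dst):
        shutil.copyfile(src, dst)                   # data first...
        st = os.stat(src)
        os.utime(dst, (st.st_atime, st.st_mtime))   # ...metadata (mtime) last

    class ResyncTimestampTest(unittest.TestCase):
        def test_mtime_copied_and_not_epoch(self):
            d = tempfile.mkdtemp()
            try:
                src = os.path.join(d, "src")
                dst = os.path.join(d, "dst")
                with open(src, "w") as f:
                    f.write("content written while the other server was down\n")
                os.utime(src, (1238760000, 1238760000))   # a known, non-epoch mtime
                simulated_resync(src, dst)
                self.assertEqual(int(os.stat(dst).st_mtime), 1238760000)
                self.assertNotEqual(int(os.stat(dst).st_mtime), 0)  # not 01-01-1970
            finally:
                shutil.rmtree(d)

    if __name__ == "__main__":
        unittest.main()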
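The md5/last-sync-timestamp idea could be prototyped entirely outside the filesystem with a cron-driven scrub script. The sketch below stores the checksum and last-scrub time in user extended attributes (Linux-only os.getxattr/os.setxattr) and flags any file whose content changed without its mtime changing. The attribute names (user.scrub.md5, user.scrub.time) are invented for illustration and are not anything GlusterFS currently uses.

    # Sketch of a periodic checksum scrub, run from cron against a backend dir.
    import hashlib, os, sys, time

    MD5_ATTR  = "user.scrub.md5"    # hypothetical attribute names
    TIME_ATTR = "user.scrub.time"

    def md5_of(path):
        h = hashlib.md5()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
        return h.hexdigest().encode()

    def scrub(root):
        for dirpath, _dirs, files in os.walk(root):
            for name in files:
                path = os.path.join(dirpath, name)
                mtime = os.stat(path).st_mtime
                try:
                    stored_md5 = os.getxattr(path, MD5_ATTR)
                    last_scrub = float(os.getxattr(path, TIME_ATTR))
                except OSError:
                    stored_md5, last_scrub = None, 0.0   # never scrubbed before
                if mtime > last_scrub:
                    stored_md5 = None    # file changed legitimately; just re-record
                current = md5_of(path)
                if stored_md5 is not None and stored_md5 != current:
                    # content changed but mtime did not: corruption or de-sync
                    print("CHECKSUM MISMATCH: %s" % path, file=sys.stderr)
                    continue             # keep the old record for inspection
                os.setxattr(path, MD5_ATTR, current)
                os.setxattr(path, TIME_ATTR, str(time.time()).encode())

    if __name__ == "__main__":
        scrub(sys.argv[1])               # e.g. scrub /data/export from a daily cron job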
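The open-for-write flag could look something like the following: a marker is set when the first write handle opens, cleared when the last one closes, and any file still carrying the marker at server start-up is treated as possibly partial and excluded from being a heal source. Again, the attribute name (user.afr.dirty-open) and the helper functions are hypothetical, not existing GlusterFS behaviour.

    # Sketch of an "open for write" marker that survives a crash.
    import os

    DIRTY_ATTR = "user.afr.dirty-open"   # hypothetical attribute name
    _open_writers = {}                   # path -> number of open write handles

    def note_write_open(path):
        """Called when a write handle is opened; the first writer sets the flag."""
        if _open_writers.get(path, 0) == 0:
            os.setxattr(path, DIRTY_ATTR, b"1")
        _open_writers[path] = _open_writers.get(path, 0) + 1

    def note_write_close(path):
        """Called when a write handle is closed; the last writer clears the flag."""
        _open_writers[path] -= 1
        if _open_writers[path] == 0:
            os.removexattr(path, DIRTY_ATTR)
            del _open_writers[path]

    def crashed_while_open(path):
        """At server start-up, before any new write opens: True means the flag
        survived a crash, so this copy may be partial and must never be chosen
        as the sync source."""
        try:
            os.getxattr(path, DIRTY_ATTR)
            return True
        except OSError:
            return False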