On Mon, 14 Sep 2009 21:44:12 +0530
Anand Avati <avati@xxxxxxxxxxx> wrote:

> > Our "split brain" is no real split brain and looks like this: Logfiles are
> > written every 5 mins. If you add a secondary server that has 14-day-old
> > logfiles on it, you notice that about half of your data vanishes while no
> > successful self-heal is performed, because the old logfiles read from the
> > secondary server overwrite the new logfiles on your primary while new data
> > is added to them.
>
> Have you been using favorite-child option?

No, the option was not used.

> Auto resolving of
> split-brain is bound to make you lose data of one of the subvolumes.
> If you had indeed specified favorite-child option, and the
> favorite-child option happens to be the server which had 14-day-old
> logs, what just happened was exactly what was in the elaborate warning
> log.
>
> Now what is more interesting for me is, the sequence of taking down
> and bringing up the servers you followed to split brain? Was it really
> just taking one server (any of them) down and bringing it back up? Did
> you face a split brain with just this? Can you please describe the
> minimal steps necessary to reproduce your issue?

Take 2 servers and one client.

1. Use a minimal replicate setup, but do _not_ add the second server yet.
2. Copy some data onto the first server via glusterfs.
3. rsync that data onto the second server directly from the first server
   (glusterfsd is not yet active there).
4. Now change some of the data, so that you have files that are really
   newer than your rsync cycle.
5. Start glusterfsd on the second server. Your client will add it.
6. Open the newer files r/w on the client.

You will notice the split-brain messages in the client logs and find that
every other file is indeed read from the second (outdated) server's fileset.
Write it back and your newer files on the first server are gone.

As said, no favorite-child option set.

> Avati

--
Regards,
Stephan
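
P.S. To make the effect of the steps above easier to follow, here is a toy model in Python. It is only a sketch of the observed behaviour, not GlusterFS code: the two servers are plain dictionaries, the function name `reproduce` is made up for illustration, and the strict alternation of reads between the two replicas is an assumption based on the "every other file" observation, not a statement about how replicate actually picks a subvolume.

```python
# Toy model only: this is NOT GlusterFS code. The two "servers" are plain
# dictionaries mapping file name -> content, and `reproduce` is a
# hypothetical helper that walks through the steps described above.

def reproduce(filenames):
    # Data is copied onto the first server via glusterfs.
    server1 = {name: "old" for name in filenames}

    # rsync to the second server while glusterfsd is not running there,
    # so the copy carries no replication metadata.
    server2 = dict(server1)

    # The files change afterwards; the first server now holds newer data.
    for name in filenames:
        server1[name] = "new"

    # The second server joins. The client opens each file r/w, reads it,
    # and writes the content back, which lands on both replicas.
    replicas = [server1, server2]
    for i, name in enumerate(sorted(filenames)):
        source = replicas[i % 2]  # assumption: reads alternate between replicas
        content = source[name]
        for replica in replicas:
            replica[name] = content

    return server1, server2


if __name__ == "__main__":
    s1, s2 = reproduce(["a.log", "b.log", "c.log", "d.log"])
    lost = [name for name in sorted(s1) if s1[name] == "old"]
    print("newer data lost on first server:", lost)
```

In this model, half of the files on the first server end up with the outdated content after the write-back, which matches the roughly 50% data loss described above.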