2009/9/15 Stephan von Krawczynski <skraw@xxxxxxxxxx>
On Mon, 14 Sep 2009 21:20:49 +0200
"Steve" <steeeeeveee@xxxxxxx> wrote:
>
> -------- Original Message --------
> > Date: Mon, 14 Sep 2009 21:14:32 +0200
> > From: Stephan von Krawczynski <skraw@xxxxxxxxxx>
> > To: Anand Avati <avati@xxxxxxxxxxx>
> > CC: gluster-devel@xxxxxxxxxx
> > Subject: Re: solutions for split brain situation
>
> > On Mon, 14 Sep 2009 21:44:12 +0530
> > Anand Avati <avati@xxxxxxxxxxx> wrote:
> >
> > > > Our "split brain" is no real split brain and looks like this: Logfiles
> > > > are written every 5 mins. If you add a secondary server that has
> > > > 14-day-old logfiles on it you notice that about half of your data
> > > > vanishes while no successful self heal is performed, because the old
> > > > logfiles read from the secondary server overwrite the new logfiles on
> > > > your primary while new data is added to them.
> > >
> > > Have you been using favorite-child option?
> >
> > No, the option was not used.
> >
> > > Auto resolving of
> > > split-brain is bound to make you lose data of one of the subvolumes.
> > > If you had indeed specified favorite-child option, and the
> > > favorite-child option happens to be the server which had 14-day-old
> > > logs, what just happened was exactly what was in the elaborate warning
> > > log.
> > >
> > > Now what is more interesting for me is the sequence of taking down
> > > and bringing up the servers you followed to get to split brain. Was it
> > > really just taking one server (any of them) down and bringing it back
> > > up? Did you face a split brain with just this? Can you please describe
> > > the minimal steps necessary to reproduce your issue?
> >
> > Take 2 servers and one client. Use a minimal replicate setup but do _not_
> > add the second server. Copy some data on the first server via glusterfs,
> > then rsync that data on the second server directly from the first server
> > (glusterfsd not yet active there). Now change some of the data to have
> > files that are really newer than your rsync cycle. Then start glusterfsd
> > on the second server. Your client will add it. Then open the newer files
> > r/w on the client.
> > You will notice the split brain messages in the client logs and find that
> > every other file gets indeed read in from the second (outdated) server
> > fileset. Write it back and your newer files on the first server are gone.
> > As said, no favorite child option set.
> >
> You just rsynced, but did you sync the extended attributes as well?

No, we explicitly did not sync the extended attributes. But your question
should be put more generally: if I have a working glusterfs server, must all
data be backed up including extended attributes?
Why should it be lethal not to back them up, when I can get data online by
simply starting to export, via glusterfsd, data that has not been touched by
glusterfsd before? (Think of a first-time export: you have some data and
install glusterfs for the very first time. Your data is of course exported
without any trouble.) Where is the difference to an rsync backup with no
extended attributes?

Can we all read this in relation to extended attributes and the cluster/replicate translator: "Understanding AFR translator".
Also, if you want more reliable restores directly to a single storage brick (rather than restoring onto a replicate translator), I would suggest a backup system that handles extended attributes. I am using bacula for this purpose, but you may find other solutions that fit.
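
As a rough illustration of what "handles extended attributes" means in practice, here is a minimal sketch (not part of glusterfs or bacula) that compares the extended attributes of an original tree with a restored copy. It assumes Linux and Python 3, where os.listxattr and os.getxattr are available; the function names are made up for this example, and attributes in the trusted.* namespace are normally only visible when it is run as root.

```python
import os


def read_xattrs(path):
    """Return {name: value} for the extended attributes visible on path.

    Linux-only. Attributes in the trusted.* namespace are usually only
    visible to root, so run this as root when checking a storage brick.
    """
    attrs = {}
    for name in os.listxattr(path, follow_symlinks=False):
        attrs[name] = os.getxattr(path, name, follow_symlinks=False)
    return attrs


def missing_xattrs(original, restored):
    """Names in `original` whose value is absent or different in `restored`."""
    return sorted(
        name for name, value in original.items()
        if restored.get(name) != value
    )


def compare_trees(src_root, dst_root):
    """Yield (relative_path, [attribute names]) where the copy lost xattrs."""
    for dirpath, dirnames, filenames in os.walk(src_root):
        for entry in dirnames + filenames:
            src = os.path.join(dirpath, entry)
            rel = os.path.relpath(src, src_root)
            dst = os.path.join(dst_root, rel)
            if not os.path.lexists(dst):
                continue  # files missing from the copy are a separate problem
            lost = missing_xattrs(read_xattrs(src), read_xattrs(dst))
            if lost:
                yield rel, lost
```

Calling compare_trees() on the original directory and on the restored one (both paths are whatever your setup uses) lists every file whose attributes did not survive the backup. An empty result suggests the backup tool carried the attributes over.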
Regards,
Michael Cassaniti