How to re-sync

Chad, Stephan - thank you for your feedback.

Just to clarify what you wrote, do you mean to say that -

1) The setup is a replicated setup, with each file being written to multiple nodes.
2) One of these nodes is brought down.
3) A replicated file that has a copy on the downed node is written to.
4) The other copies are updated as writes happen while this node is still down.
5) After this node is brought back up, the client sometimes sees the old file from the revived node
instead of picking the file from a node that has the latest copy.
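
To make sure we are talking about the same sequence, here is a toy model of
steps 1-5 in Python. It is only a sketch: the Replica class, the version
counter and the two pick functions are made up for illustration and are not
how GlusterFS tracks or selects copies internally.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    name: str
    up: bool = True
    content: str = ""
    version: int = 0  # hypothetical write counter, not a GlusterFS field


def write(replicas, data):
    """Apply a write to every replica that is currently up."""
    for r in replicas:
        if r.up:
            r.content = data
            r.version += 1


def heal(replicas, source):
    """Copy the chosen source replica's content onto all other replicas."""
    for r in replicas:
        if r is not source:
            r.content, r.version = source.content, source.version


def scenario(pick_source):
    a, b = Replica("node-a"), Replica("node-b")
    replicas = [a, b]
    write(replicas, "v1")   # step 1: file is written to both nodes
    b.up = False            # step 2: one node is brought down
    write(replicas, "v2")   # steps 3-4: only the surviving copy is updated
    b.up = True             # step 5: the node comes back
    heal(replicas, pick_source(replicas))
    return a.content, b.content


def pick_latest(replicas):
    """Expected behaviour: the copy with the most writes wins."""
    return max(replicas, key=lambda r: r.version)


def pick_revived(replicas):
    """Reported behaviour: the stale copy on the revived node wins."""
    return replicas[1]


print(scenario(pick_latest))   # ('v2', 'v2')
print(scenario(pick_revived))  # ('v1', 'v1')
```

The second result is the case I understand you to be describing: the newer
contents are lost because the stale copy is used as the source.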

If the above is correct, quick questions -

1) What versions are you using?
2) Can you share your volume files? Are they generated using volgen?
3) Did you notice any patterns for the files where the wrong copy was picked? For example,
were they open when the node was brought down?
4) Is there any other way to reproduce the problem?
5) Did you observe any other patterns when you see the problem?
6) Would you have listings of the problem file(s) from the replica nodes?
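
For question 6, something along these lines would give us comparable
listings. It is a minimal sketch that assumes each replica's backend export
directory is readable as a local path (for example, by running it once per
node and comparing the output by hand). The function names are my own, not
part of any Gluster tool.

```python
import hashlib
import os


def describe(path):
    """Return size, mtime and SHA-256 of one file, or None if it is missing."""
    if not os.path.isfile(path):
        return None
    st = os.stat(path)
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in 1 MiB chunks so large files are not loaded into memory.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return {"size": st.st_size, "mtime": st.st_mtime, "sha256": digest.hexdigest()}


def compare(backends, relpath):
    """Describe the same relative path under each backend directory."""
    return {b: describe(os.path.join(b, relpath)) for b in backends}


def copies_differ(report):
    """True if the copies do not all have the same checksum (or one is missing)."""
    digests = {info["sha256"] if info else None for info in report.values()}
    return len(digests) > 1
```

For example, compare(["/data/export1", "/data/export2"], "some/file") (both
paths are placeholders) returns one entry per backend, and copies_differ()
on that result tells you whether the copies have diverged.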

If, however, my understanding is not correct, then please let me know, with some
examples.

Regards,
Tejas.

----- Original Message -----
From: "Chad" <ccolumbu at hotmail.com>
To: "Stephan von Krawczynski" <skraw at ithnet.com>
Cc: gluster-users at gluster.org
Sent: Sunday, March 7, 2010 9:32:27 PM GMT +05:30 Chennai, Kolkata, Mumbai, New Delhi
Subject: Re: How to re-sync

I actually do prefer top post.

Well this "overwritten" behavior is what I saw as well and that is a REALLY REALLY bad thing.
Which is why I asked my question in the first place.

Is there a gluster developer out there working on this problem specifically?
Could we add some kind of "sync done" command that has to be run manually, and until it is run the failed node is not used?
The bottom line for me is that I would much rather run on a performance-degraded array until a sysadmin intervenes than lose any data.

^C



Stephan von Krawczynski wrote:
> I love top-post ;-)
> 
> Generally, you are right. But in real life you cannot rely on this
> "smartness". We tried exactly this point and had to find out that the clients
> do not always select the correct file version (i.e. the latest) automatically.
> Our idea in the testcase was to bring down a node, update its kernel and revive
> it - just as you would like to do it in the real world for a kernel update.
> We found out that some files were taken from the downed node afterwards and
> the new contents on the other node got in fact overwritten.
> This does not happen generally, of course. But it does happen. We could only
> stop this behaviour by setting "favorite-child". But that does not really help
> a lot, since we want to take down all nodes some other day.
> This is in fact one of our show-stoppers.
> 
> 
> On Sun, 7 Mar 2010 01:33:14 -0800
> Liam Slusser <lslusser at gmail.com> wrote:
> 
>> Assuming you used raid1 (replicate), you DO bring up the new machine
>> and start gluster.  On one of your gluster mounts you run a ls -alR
>> and it will resync the new node.  The gluster clients are smart enough
>> to get the files from the first node.
>>
>> liam
>>
>> On Sat, Mar 6, 2010 at 11:48 PM, Chad <ccolumbu at hotmail.com> wrote:
>>> Ok, so assuming you have N glusterfsd servers (say 2 cause it does not
>>> really matter).
>>> Now one of the servers dies.
>>> You repair the machine and bring it back up.
>>>
>>> I think 2 things:
>>> 1. You should not start glusterfsd on boot (you need to sync the HD first)
>>> 2. When it is up how do you re-sync it?
>>>
>>> Do you rsync the underlying mount points?
>>> If it is a busy gluster cluster it will be getting new files all the time.
>>> So how do you sync and bring it back up safely so that clients don't connect
>>> to an incomplete server?
>>>
>>> ^C
>>> _______________________________________________
>>> Gluster-users mailing list
>>> Gluster-users at gluster.org
>>> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
>>>
>>
> 
> 