Hi Christopher,

On 3/6/07, Christopher Hawkins <chawkins@xxxxxxxxxxxxxxxxxxxx> wrote:
The last fellow to post mentioned recovery... I have a question also: If I had several storage servers and a number of clients accessing them, and I were to lose a storage server, how best to bring it back online? I would be using AFR to keep multiple copies of all files, so I know the cluster will not lose data. But when the node goes down, does the AFR translator figure out by itself that instead of the 3x copies I specified, there are now only 2x because I lost a storage node? Or does it only evaluate that at file creation time?
AFR is nothing but an implementation of the open, read, write, getattr, etc. calls. It calls these functions on its children; if a child is down, the function (from protocol/client) returns ENOTCONN to AFR, which ignores it. So AFR does not care whether a child is up or down; it is up to the child translator to pass these calls on to the servers if they are up.
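To make the fan-out concrete, here is a rough Python sketch of the behaviour described above. It is not the GlusterFS code (which is written in C); the class names Child and Replicate are made up for illustration, and raising ENOTCONN when every child is down is my own assumption, not something stated above.

```python
import errno


class Child:
    """Hypothetical stand-in for a protocol/client child translator."""

    def __init__(self, name, up=True):
        self.name = name
        self.up = up
        self.files = {}  # path -> data held by this storage server

    def write(self, path, data):
        # A child whose server is unreachable fails with ENOTCONN.
        if not self.up:
            raise OSError(errno.ENOTCONN, "Transport endpoint is not connected")
        self.files[path] = data
        return len(data)


class Replicate:
    """Hypothetical AFR-like layer: forwards each call to every child."""

    def __init__(self, children):
        self.children = children

    def write(self, path, data):
        results = []
        for child in self.children:
            try:
                results.append(child.write(path, data))
            except OSError as exc:
                # ENOTCONN from a down child is ignored; anything else
                # is a real error and is passed up to the caller.
                if exc.errno != errno.ENOTCONN:
                    raise
        if not results:
            # Assumption: with no reachable child, report ENOTCONN upward.
            raise OSError(errno.ENOTCONN, "no child reachable")
        return results[0]


servers = [Child("server1"), Child("server2", up=False), Child("server3")]
afr = Replicate(servers)
written = afr.write("/a.txt", b"hello")
```

In this sketch the write succeeds on server1 and server3, while server2 simply misses it, because nothing in the replicating layer tracks which children were down for a given call.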
And when I bring the storage node back, say it takes me two days to fix it, I assume I should probably wipe the drives so as not to introduce old copies of files that are now out of date (or does AFR update them)? And the ALU scheduler will start using the blank space more heavily for new writes, because it is preferred as "less used" and the storage use will eventually even out again?
As of now we do not have a tool to bring the returning machine up to date with the other AFR servers. It is on our task list.
Thanks for any answers! Chris