Re: [Gluster-users] Phasing out replace-brick for data migration in favor of remove-brick.

Inline response.

On 09/27/2013 02:26 PM, James wrote:
On Fri, 2013-09-27 at 00:35 -0700, Anand Avati wrote:
Hello all,
Hey,

Interesting timing for this post...
I've actually started working on automatic brick addition/removal. (I'm
planning to add this to puppet-gluster of course.) I was hoping you
could help out with the algorithm. I think it's a bit different if
there's no replace-brick command as you are proposing.

Here's the problem:
Given a logically optimal initial volume:

volA: rep=2; h1:/b1 h2:/b1 h3:/b1 h4:/b1 h1:/b2 h2:/b2 h3:/b2 h4:/b2

suppose I know that I want to add/remove bricks such that my new volume
(if I had created it new) looks like:

volB: rep=2; h1:/b1 h3:/b1 h4:/b1 h5:/b1 h6:/b1 h1:/b2 h3:/b2 h4:/b2
h5:/b2 h6:/b2

What is the optimal algorithm for determining the correct sequence of
transforms needed to accomplish this task? Obviously there are
some simpler corner cases, but I'd like to solve the general case.

The transforms are obviously things like running the add-brick {...} and
remove-brick {...} commands.

This is exactly why our best practices recommend exporting a directory inside a mountpoint as a brick, in this case h1:/b1/d1 (where d1 is a directory inside the mountpoint /b1).

That way you can later have a brick h1:/b1/d2, which is technically the same thing you would like to have in volB.

Also, it is never good to swap/change/move replica pairs to different sets; that would lead to many issues, such as duplicate files.
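
To make the replica-pair concern concrete, here is a minimal sketch (not a gluster tool, just an illustration) that diffs the volA and volB brick lists from the question and reports which existing replica sets would be split by a naive diff. It assumes replica sets are formed from consecutive bricks in the order listed; the function names are hypothetical.

```python
def replica_sets(bricks, replica):
    """Group an ordered brick list into consecutive replica sets (assumed layout)."""
    if len(bricks) % replica != 0:
        raise ValueError("brick count must be a multiple of the replica count")
    return [tuple(bricks[i:i + replica]) for i in range(0, len(bricks), replica)]


def brick_diff(old, new):
    """Return (to_add, to_remove) as ordered lists; a naive set difference."""
    old_set, new_set = set(old), set(new)
    to_add = [b for b in new if b not in old_set]
    to_remove = [b for b in old if b not in new_set]
    return to_add, to_remove


def split_sets(old, replica, to_remove):
    """Replica sets of the old volume that lose some, but not all, of their bricks."""
    gone = set(to_remove)
    return [s for s in replica_sets(old, replica)
            if 0 < sum(b in gone for b in s) < replica]


# Brick lists copied from the volA / volB example in the question.
vol_a = "h1:/b1 h2:/b1 h3:/b1 h4:/b1 h1:/b2 h2:/b2 h3:/b2 h4:/b2".split()
vol_b = ("h1:/b1 h3:/b1 h4:/b1 h5:/b1 h6:/b1 "
         "h1:/b2 h3:/b2 h4:/b2 h5:/b2 h6:/b2").split()

to_add, to_remove = brick_diff(vol_a, vol_b)
partial = split_sets(vol_a, 2, to_remove)

print("add:   ", to_add)
print("remove:", to_remove)
print("replica sets that would be split:", partial)
```

For this example the naive diff removes only the h2 bricks, each of which is half of a replica pair, so the diff alone does not translate directly into a sequence of add-brick/remove-brick calls that operate on whole replica sets.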



- Replace brick strictly requires a server with enough free space to hold
the data of the old brick, whereas remove-brick will evenly spread out the
data of the brick being removed amongst the remaining servers.
Can you talk more about the replica = N case (where N is 2 or 3?)
With remove brick, add brick you will need add/remove N (replica count)
bricks at a time, right? With replace brick, you could just swap out
one, right? Isn't that a missing feature if you remove replace brick?
For that particular case of swapping without data migration, 'replace-brick' will still exist. What it does is replace an existing brick of a replica pair with an empty brick, so that replicate's self-heal daemon populates it with the data.

Please do ask any questions / raise concerns at this stage :)
I heard with 3.4 you can somehow change the replica count when adding
new bricks... What's the full story here please?


Yes, CLI support for this has existed since glusterfs-3.3.x itself (http://review.gluster.com/158); it is just that there are a few bugs.

syntax of add-brick:

gluster volume add-brick <VOLNAME> [<stripe|replica> <COUNT>] <NEW-BRICK> ... [force] - add brick to volume <VOLNAME>

You can give 'replica N', where N is the existing replica count -1/+1.
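
As a rough illustration of the add-brick syntax quoted above, the following sketch assembles the argument list for such a call. The helper name, the volume name, and the brick paths are hypothetical, and the example assumes that raising an 8-brick replica 2 volume to replica 3 takes one new brick per existing replica set; it only builds the command and does not run it.

```python
def add_brick_command(volname, new_bricks, replica=None, force=False):
    """Build argv for: gluster volume add-brick <VOLNAME> [replica <COUNT>] <NEW-BRICK> ... [force]"""
    if not new_bricks:
        raise ValueError("at least one new brick is required")
    cmd = ["gluster", "volume", "add-brick", volname]
    if replica is not None:
        # Optional new replica count, placed before the brick list as in the syntax line.
        cmd += ["replica", str(replica)]
    cmd += list(new_bricks)
    if force:
        cmd.append("force")
    return cmd


# Hypothetical example: four new bricks, one per existing replica set of volA.
cmd = add_brick_command(
    "volA",
    ["h5:/b1/d1", "h6:/b1/d1", "h5:/b2/d1", "h6:/b2/d1"],
    replica=3,
)
print(" ".join(cmd))
```

The resulting list could be passed to subprocess.run on a node where the gluster CLI is installed; whether the replica-count change succeeds depends on the release and the bugs mentioned above.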

Regards,
Amar


