> Hello all,
> DHT's remove-brick + rebalance has been enhanced in the last couple of
> releases to be quite sophisticated. It can handle graceful decommissioning
> of bricks, including open file descriptors and hard links.
>

The last set of patches for this should be reviewed and accepted before we
make that claim :-) [ http://review.gluster.org/5891 ]

> This in a way is a feature overlap with replace-brick's data migration
> functionality. Replace-brick's data migration is currently also used for
> planned decommissioning of a brick.
>
> Reasons to remove replace-brick (or why remove-brick is better):
>
> - There are two methods of moving data. It is confusing for the users and
> hard for developers to maintain.
>
> - If server being replaced is a member of a replica set, neither
> remove-brick nor replace-brick data migration is necessary, because
> self-healing itself will recreate the data (replace-brick actually uses
> self-heal internally)
>
> - In a non-replicated config if a server is getting replaced by a new one,
> add-brick <new> + remove-brick <old> "start" achieves the same goal as
> replace-brick <old> <new> "start".
>

Should we also phase out the CLI form of 'remove-brick' without any option?
If users run it by mistake, they would lose data. We should enforce the
'start' and then 'commit' usage of remove-brick. And if anyone needs the old
method, they still have the 'force' option.

> - In a non-replicated config, <replace-brick> is NOT glitch free
> (applications witness ENOTCONN if they are accessing data) whereas
> add-brick <new> + remove-brick <old> is completely transparent.
>

+10 (that's the number of bugs open on these things :-)

> - Replace brick strictly requires a server with enough free space to hold
> the data of the old brick, whereas remove-brick will evenly spread out the
> data of the brick being removed amongst the remaining servers.
>
> - Replace-brick code is complex and messy (the real reason :p).
>

I wanted to see this reason as the first point, but it's OK as long as we
mention it. I too agree that it's _hard_ to maintain that piece of code.

> - No clear reason why replace-brick's data migration is better in any way
> to remove-brick's data migration.
>

One reason I heard when I sent the mail on gluster-devel earlier
( http://lists.nongnu.org/archive/html/gluster-devel/2012-10/msg00050.html )
was that the remove-brick way is a bit slower than replace-brick. The
technical reason is that remove-brick does DHT's readdir, whereas
replace-brick does a brick-level readdir.

> I plan to send out patches to remove all traces of replace-brick data
> migration code by 3.5 branch time.
>

Thanks for the initiative, let me know if you need help.

> NOTE that replace-brick command itself will still exist, and you can
> replace one server with another in case a server dies. It is only the data
> migration functionality being phased out.
>

Yes, we need to be careful about this. We would still need 'replace-brick'
to phase out a dead brick. The other day, there was some discussion on
having 'gluster peer replace <old-peer> <new-peer>', which would rewrite all
the vol files properly. But that's mostly for the 3.6 time frame, IMO.

> Please do ask any questions / raise concerns at this stage :)
>

What is the window before you start sending out patches? I see
http://review.gluster.org/6010, which I guess is not totally complete without
phasing out the pump xlator :-)

I personally am all in for this change, as it helps me finish a few more
enhancements I am working on, like the 'discover()' changes.

Regards,
Amar