Re: Proper procedure for osd/host removal

Thanks - I suspected as much. I was thinking of a course of action that would allow setting the weight of an entire host to zero in the crush map - thus forcing the migration of the data out of that host's OSDs - followed by the crush and osd removal, one by one (hopefully this time without another backfill session).

The problem is I don't have anywhere to test how that would work and/or what the side effects would be (if any).
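
For the record, something along these lines is what I had in mind - untested, and the osd ids and host name are just examples:

  # drain the old host first by zeroing the crush weight of each of its osds
  ceph osd crush reweight osd.10 0
  ceph osd crush reweight osd.11 0
  ceph osd crush reweight osd.12 0

  # once the cluster is back to active+clean, remove the (now empty) osds;
  # the host bucket is already at weight 0, so hopefully nothing moves again
  ceph osd out 10
  /etc/init.d/ceph stop osd.10      # or "stop ceph-osd id=10" with upstart
  ceph osd crush remove osd.10
  ceph auth del osd.10
  ceph osd rm 10

  # finally drop the empty host bucket
  ceph osd crush remove <old-host>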


On 15 Dec 2014, at 21:07, Adeel Nazir <adeel@xxxxxxxxx> wrote:

> I'm going through something similar, and it seems like the double backfill you're experiencing is about par for the course. According to the CERN presentation (http://www.slideshare.net/Inktank_Ceph/scaling-ceph-at-cern, slide 19), doing a 'ceph osd crush rm osd.<ID>' should save you the double backfill, but I haven't experienced that in my 0.80.5 cluster. Even after I do the crush rm and finally remove the osd via 'ceph osd rm <ID>', it computes a new map and does the backfill again. As far as I can tell, there's no way around it without editing the map manually, making whatever changes you require and then pushing the new map. I personally am not experienced enough to feel comfortable making that kind of a change.
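> 
> (For reference, the manual edit would be roughly the following - I haven't tried it myself, and the file names are just examples:)
> 
>   ceph osd getcrushmap -o crushmap.bin
>   crushtool -d crushmap.bin -o crushmap.txt
>   # edit crushmap.txt: drop the osd devices and/or host bucket and fix the weights in one go
>   crushtool -c crushmap.txt -o crushmap.new
>   ceph osd setcrushmap -i crushmap.new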
> 
> 
> Adeel
> 
>> -----Original Message-----
>> From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of
>> Dinu Vlad
>> Sent: Monday, December 15, 2014 11:35 AM
>> To: ceph-users@xxxxxxxxxxxxxx
>> Subject:  Proper procedure for osd/host removal
>> 
>> Hello,
>> 
>> I've been working to upgrade the hardware on a semi-production ceph cluster,
>> following the instructions for OSD removal from
>> http://ceph.com/docs/master/rados/operations/add-or-rm-osds/#removing-osds-manual.
>> Basically, I've added the new hosts to the cluster and now I'm removing the
>> old ones from it.
>> 
>> What I found curious is that after the sync triggered by "ceph osd out <id>"
>> finishes and I stop the osd process and remove it from the crush map, another
>> session of synchronization is triggered - sometimes this one takes longer than
>> the first. Also, removing an empty "host" bucket from the crush map triggered
>> yet another resynchronization.
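>> 
>> The exact sequence I've been following, per the doc above (osd 10 and the
>> host name are just examples), is:
>> 
>>   ceph osd out 10                    # first backfill starts here
>>   # wait for active+clean, then:
>>   /etc/init.d/ceph stop osd.10       # or "stop ceph-osd id=10" with upstart
>>   ceph osd crush remove osd.10       # second backfill starts here
>>   ceph auth del osd.10
>>   ceph osd rm 10
>>   ceph osd crush remove <old-host>   # removing the empty bucket moved data yet again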
>> 
>> I noticed that the overall weight of the host bucket does not change in the
>> crush map when one of its OSDs is marked "out", so what is happening is more
>> or less normal behavior - however, it remains time-consuming. Is there
>> anything that can be done to avoid the double resync?
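>> 
>> (This is easy to verify - assuming a host bucket named "old-host", something
>> like the following shows that marking an osd "out" only zeroes its reweight
>> column, while the crush weight column and the bucket total stay the same:)
>> 
>>   ceph osd tree | grep -A 4 old-host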
>> 
>> I'm running ceph 0.72.2 on top of Ubuntu 12.04 on the OSD hosts.
>> 
>> Thanks,
>> Dinu

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



