It's probably worth noting that if you're planning on removing multiple
OSDs in this manner, you should make sure they are not in the same
failure domain, per your CRUSH rules. For example, if you keep one
replica per node and three copies (as in the default) and remove OSDs
from multiple nodes without marking them as out first, you risk losing
data if those OSDs hold copies of the same placement group, depending on
the number of replicas you have and the number of OSDs you remove at
once. That said, in the above scenario it would be safe to remove
multiple OSDs from a single node simultaneously, since the CRUSH rules
aren't placing multiple replicas on the same host.

-Steve

On 11/30/2015 04:33 AM, Wido den Hollander wrote:
>
> On 30-11-15 10:08, Carsten Schmitt wrote:
>> Hi all,
>>
>> I'm running ceph version 0.94.5 and I need to downsize my servers
>> because of insufficient RAM.
>>
>> So I want to remove OSDs from the cluster, and according to the manual
>> it's a pretty straightforward process:
>> I begin with "ceph osd out {osd-num}" and the cluster starts
>> rebalancing immediately, as expected. After that process finishes, the
>> rest should be quick:
>> stop the daemon ("/etc/init.d/ceph stop osd.{osd-num}") and remove the
>> OSD from the CRUSH map ("ceph osd crush remove {name}").
>>
>> But after entering the last command, the cluster starts rebalancing
>> again.
>>
>> That I don't understand: shouldn't one rebalancing process be enough,
>> or am I missing something?
>>
> Well, for CRUSH these are two different things. First, the weight of
> the OSD goes to 0 (zero), but it's still part of the CRUSH map.
>
> Say there are still 5 OSDs on that host, 4 with a weight of X and one
> with a weight of zero.
>
> When you remove the OSD, there are only 4 OSDs left; that's another
> change for CRUSH.
>
> What you should do in this case: only remove the OSD from CRUSH and
> don't mark it as out.
>
> When the cluster is done rebalancing you can mark it out, but that
> won't cause another rebalance since it's already out of the CRUSH map.
>
> It will still help the other OSDs migrate the data, since the cluster
> knows it still holds that PG data.
>
>> My config is pretty vanilla, except for:
>> [osd]
>> osd recovery max active = 4
>> osd max backfills = 4
>>
>> Thanks in advance,
>> Carsten

--
Steve Anthony
LTS HPC Support Specialist
Lehigh University
sma310@xxxxxxxxxx
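
[Editor's note: for reference, the order Wido describes might look
roughly like the sketch below. It is only an illustration: osd.12 is a
placeholder id, and the exact cleanup steps should be checked against
the documentation for your release (0.94/Hammer here).]

  # See which host each OSD lives on, per Steve's note about
  # failure domains, before removing anything
  ceph osd tree

  # Remove the OSD from the CRUSH map first; this triggers the
  # one and only rebalance
  ceph osd crush remove osd.12

  # Wait for the cluster to settle
  ceph -s

  # Marking it out now does not trigger another rebalance,
  # since CRUSH no longer references it
  ceph osd out 12

  # Stop the daemon and clean up the remaining metadata
  /etc/init.d/ceph stop osd.12
  ceph auth del osd.12
  ceph osd rm 12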