It's probably worth noting that if you're planning on removing multiple
OSDs in this manner, you should make sure they are not in the same
failure domain, per your CRUSH rules. For example, if you keep one
replica per node and three copies (as in the default) and remove OSDs
from multiple nodes without marking them as out first, you risk losing
data if those OSDs hold copies of the same placement group, depending on
the number of replicas you have and the number of OSDs you remove at
once. That said, in the above scenario it would be safe to remove
multiple OSDs from a single node simultaneously, since the CRUSH rules
aren't placing multiple replicas on the same host.

-Steve

On 11/30/2015 04:33 AM, Wido den Hollander wrote:
>
> On 30-11-15 10:08, Carsten Schmitt wrote:
>> Hi all,
>>
>> I'm running ceph version 0.94.5 and I need to downsize my servers
>> because of insufficient RAM.
>>
>> So I want to remove OSDs from the cluster, and according to the manual
>> it's a pretty straightforward process:
>> I begin with "ceph osd out {osd-num}" and the cluster starts
>> rebalancing immediately, as expected. After that process finishes, the
>> rest should be quick:
>> stop the daemon ("/etc/init.d/ceph stop osd.{osd-num}") and remove the
>> OSD from the CRUSH map ("ceph osd crush remove {name}").
>>
>> But after entering the last command, the cluster starts rebalancing
>> again.
>>
>> That I don't understand: shouldn't one rebalancing process be enough,
>> or am I missing something?
>>
> Well, for CRUSH these are two different things. First, the weight of
> the OSD goes to 0 (zero), but it's still part of the CRUSH map.
>
> Say there are still 5 OSDs on that host, 4 with a weight of X and one
> with a weight of zero.
>
> When you remove the OSD, there are only 4 OSDs left; that's another
> change for CRUSH.
>
> What you should do in this case: only remove the OSD from CRUSH and
> don't mark it as out.
>
> When the cluster is done rebalancing you can mark it out, but that
> won't cause another rebalance since it's already out of the CRUSH map.
>
> It will still help the other OSDs migrate the data, since the cluster
> knows it still holds that PG data.
>
>> My config is pretty vanilla, except for:
>> [osd]
>> osd recovery max active = 4
>> osd max backfills = 4
>>
>> Thanks in advance,
>> Carsten

--
Steve Anthony
LTS HPC Support Specialist
Lehigh University
sma310@xxxxxxxxxx
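
[Editor's note: for reference, the order Wido describes might look
roughly like the sketch below. It is only an illustration: osd.12 is a
placeholder id, and the exact cleanup steps should be checked against
the documentation for your release (0.94/Hammer here).]

  # See which host each OSD lives on, per Steve's note about
  # failure domains, before removing anything
  ceph osd tree

  # Remove the OSD from the CRUSH map first; this triggers the
  # one and only rebalance
  ceph osd crush remove osd.12

  # Wait for the cluster to settle
  ceph -s

  # Marking it out now does not trigger another rebalance,
  # since CRUSH no longer references it
  ceph osd out 12

  # Stop the daemon and clean up the remaining metadata
  /etc/init.d/ceph stop osd.12
  ceph auth del osd.12
  ceph osd rm 12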