Correct, "crush weight" and normal "reweight" are indeed very different.

The original post mentions "rebuilding" servers. In that case the correct way
is to use "destroy" and then explicitly re-use the OSD afterwards.
"purge" is really only for OSDs that you don't get back (or broken disks that
you don't replace quickly).

Paul

--
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90

On Tue, Jun 2, 2020 at 12:32 PM Thomas Byrne - UKRI STFC
<tom.byrne@xxxxxxxxxx> wrote:

> As you have noted, 'ceph osd reweight 0' is the same as a 'ceph osd out',
> but it is not the same as removing the OSD from the crush map (or setting
> its crush weight to 0). This explains the double rebalance you observed
> when you mark an OSD out (or reweight an OSD to 0) and then remove it
> later.
>
> To avoid this, I use a crush reweight for the initial step to move PGs off
> an OSD when draining nodes. You can then purge the OSD with no further PG
> movement.
>
> Double movement:
>
>   ceph osd out $i
>   # rebalancing
>
>   ceph osd purge $i
>   # more rebalancing
>
> Single movement:
>
>   ceph osd crush reweight $i 0
>   # rebalancing
>
>   ceph osd purge $i
>   # no rebalancing
>
> The reason this occurs (as I understand it) is that the reweight value is
> taken into account later in the crush calculation. An OSD with a reweight
> of 0 can still be picked for a PG set; the reweight then kicks in and
> forces the calculation to be retried, giving a different PG set than if
> the OSD were not present or had a crush weight of 0.
>
> Cheers,
> Tom
>
> > -----Original Message-----
> > From: Brent Kennedy <bkennedy@xxxxxxxxxx>
> > Sent: 02 June 2020 04:44
> > To: 'ceph-users' <ceph-users@xxxxxxx>
> > Subject: OSD upgrades
> >
> > We are rebuilding servers and before luminous our process was:
> >
> > 1. Reweight the OSD to 0
> > 2. Wait for rebalance to complete
> > 3. Out the osd
> > 4. Crush remove osd
> > 5. Auth del osd
> > 6. Ceph osd rm #
> >
> > Seems the luminous documentation says that you should:
> >
> > 1. Out the osd
> > 2. Wait for the cluster rebalance to finish
> > 3. Stop the osd
> > 4. Osd purge #
> >
> > Is reweighting to 0 no longer suggested?
> >
> > Side note: I tried our existing process and even after the reweight, the
> > entire cluster restarted the balance again after step 4 (crush remove
> > osd) of the old process. I should also note that, after reweighting to 0,
> > when I tried to run "ceph osd out #", it said it was already marked out.
> >
> > I assume the docs are correct, but just want to make sure since
> > reweighting had been previously recommended.
> >
> > Regards,
> > -Brent
> >
> > Existing Clusters:
> >
> > Test: Nautilus 14.2.2 with 3 osd servers, 1 mon/man, 1 gateway, 2 iscsi
> > gateways (all virtual on nvme)
> >
> > US Production(HDD): Nautilus 14.2.2 with 11 osd servers, 3 mons, 4
> > gateways, 2 iscsi gateways
> >
> > UK Production(HDD): Nautilus 14.2.2 with 12 osd servers, 3 mons, 4
> > gateways
> >
> > US Production(SSD): Nautilus 14.2.2 with 6 osd servers, 3 mons, 3
> > gateways, 2 iscsi gateways
> >
> > _______________________________________________
> > ceph-users mailing list -- ceph-users@xxxxxxx
> > To unsubscribe send an email to ceph-users-leave@xxxxxxx
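For reference, the single-movement drain and the purge/destroy distinction discussed in this thread can be sketched as a few shell functions. This is a minimal sketch under stated assumptions, not a tested runbook: the function names drain_osd, purge_osd and destroy_osd and the POLL_SECONDS variable are hypothetical, the OSD is passed as its numeric id, and using 'ceph osd safe-to-destroy' as the wait condition is a choice made for this example rather than something stated in the thread. As in the luminous procedure quoted above, the OSD daemon has to be stopped on its host before the purge or destroy step; that is left to the operator here.

```shell
#!/bin/sh
# Sketch: remove or rebuild one OSD with a single round of data movement.
# Function names and POLL_SECONDS are hypothetical, chosen for this example.

# Set the CRUSH weight to 0 so PGs move off the OSD, then poll until the
# cluster reports that the OSD can be removed without risking data.
drain_osd() {
    osd_id="$1"
    ceph osd crush reweight "osd.${osd_id}" 0 || return 1
    until ceph osd safe-to-destroy "osd.${osd_id}"; do
        sleep "${POLL_SECONDS:-10}"
    done
}

# OSD that is not coming back: drop its CRUSH entry, auth key and id.
# Stop the ceph-osd daemon on its host first.
purge_osd() {
    ceph osd purge "$1" --yes-i-really-mean-it
}

# OSD that will be rebuilt: mark it destroyed but keep its id and CRUSH
# entry so the new disk can take over the same slot.
# Stop the ceph-osd daemon on its host first.
destroy_osd() {
    ceph osd destroy "$1" --yes-i-really-mean-it
}
```

A possible use would be 'drain_osd 12', then stopping the daemon, then 'purge_osd 12'. For a rebuild, 'destroy_osd 12' would be followed by recreating the OSD with its old id, for example with 'ceph-volume lvm create --osd-id 12 --data <device>' on the OSD host; that step is omitted from the sketch because device names are site-specific.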