Re: OSD upgrades

Correct, "crush weight" and normal "reweight" are indeed very different.
The original post mentions "rebuilding" servers; in that case the correct
way is to use "destroy" and then explicitly re-use the OSD ID afterwards.
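
The message above names "destroy" but gives no commands, so here is a
minimal sketch of what a destroy-and-re-use sequence might look like. It
assumes a ceph-volume/LVM based OSD, a systemd unit named ceph-osd@<id>
(non-containerized deployment), and that the data device is known; the
helper names run and replace_osd are made up for illustration. It only
prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: prints the commands unless DRY_RUN=0 is set explicitly.
DRY_RUN="${DRY_RUN:-1}"

# run CMD...: echo the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# replace_osd ID DEVICE (hypothetical helper)
# Keeps the OSD id and its CRUSH entry, and rebuilds the OSD on DEVICE.
replace_osd() {
    id="$1"
    dev="$2"
    # Ask the cluster whether destroying this OSD is currently safe.
    run ceph osd safe-to-destroy "osd.$id" &&
    # Assumed unit name; the OSD must be down before it can be destroyed.
    run systemctl stop "ceph-osd@$id" &&
    # Mark the OSD destroyed; the id and CRUSH position are retained.
    run ceph osd destroy "$id" --yes-i-really-mean-it &&
    # Wipe the old device, then recreate the OSD under the same id.
    run ceph-volume lvm zap "$dev" --destroy &&
    run ceph-volume lvm create --osd-id "$id" --data "$dev"
}
```

Calling replace_osd 7 /dev/sdX with the default DRY_RUN=1 just lists the
five commands, which is a reasonable way to review the sequence before
running anything against a real cluster.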

"purge" is really only for OSDs that you don't get back (or broken disks
that you don't replace quickly).
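
For the case where the OSD really is going away for good, the
single-movement drain described in the quoted message below could be
sketched like this. The helper names run, drain_osd and purge_osd are
hypothetical, the systemd unit name ceph-osd@<id> is an assumption, and
waiting for the rebalance to finish between the two steps is left to the
operator. As before, nothing is executed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: prints the commands unless DRY_RUN=0 is set explicitly.
DRY_RUN="${DRY_RUN:-1}"

# run CMD...: echo the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# drain_osd ID (hypothetical helper)
# Set the CRUSH weight to 0 so PGs move off the OSD exactly once.
drain_osd() {
    run ceph osd crush reweight "osd.$1" 0
}

# purge_osd ID (hypothetical helper)
# Run only after the rebalance triggered by drain_osd has completed.
purge_osd() {
    # Refuse to continue while the cluster still depends on this OSD.
    run ceph osd safe-to-destroy "osd.$1" &&
    # Assumed unit name for a non-containerized deployment.
    run systemctl stop "ceph-osd@$1" &&
    # Removes the OSD from the CRUSH map, deletes its key and its id.
    run ceph osd purge "$1" --yes-i-really-mean-it
}
```

Because the CRUSH weight is already 0 when the purge happens, removing
the OSD from the map should not trigger a second round of data movement.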


Paul

-- 
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90


On Tue, Jun 2, 2020 at 12:32 PM Thomas Byrne - UKRI STFC <
tom.byrne@xxxxxxxxxx> wrote:

> As you have noted, 'ceph osd reweight 0' is the same as a 'ceph osd out',
> but it is not the same as removing the OSD from the crush map (or setting
> its crush weight to 0). This explains the double rebalance you observed
> when you mark an OSD out (or reweight an OSD to 0) and then remove it
> later.
>
> To avoid this, I use a crush reweight for the initial step to move PGs off
> an OSD when draining nodes. You can then purge the OSD with no further PG
> movement.
>
> Double movement:
>   ceph osd out $i
>   # rebalancing
>   ceph osd purge $i --yes-i-really-mean-it
>   # more rebalancing
>
> Single movement:
>   ceph osd crush reweight osd.$i 0
>   # rebalancing
>   ceph osd purge $i --yes-i-really-mean-it
>   # no rebalancing
>
> The reason this occurs (as I understand it) is that the reweight value is
> applied late in the crush calculation. An OSD with a reweight of 0 can
> still be picked for a PG set; the reweight then kicks in, rejects it, and
> forces the calculation to be retried. The retried result differs from the
> PG set you would get if the OSD were not present or had a crush weight
> of 0.
>
> Cheers,
> Tom
>
> > -----Original Message-----
> > From: Brent Kennedy <bkennedy@xxxxxxxxxx>
> > Sent: 02 June 2020 04:44
> > To: 'ceph-users' <ceph-users@xxxxxxx>
> > Subject:  OSD upgrades
> >
> > We are rebuilding servers and before Luminous our process was:
> >
> > 1. Reweight the OSD to 0
> > 2. Wait for rebalance to complete
> > 3. Out the osd
> > 4. Crush remove osd
> > 5. Auth del osd
> > 6. Ceph osd rm #
> >
> > Seems the Luminous documentation says that you should:
> >
> > 1. Out the osd
> > 2. Wait for the cluster rebalance to finish
> > 3. Stop the osd
> > 4. Osd purge #
> >
> > Is reweighting to 0 no longer suggested?
> >
> > Side note: I tried our existing process and even after reweight, the entire
> > cluster restarted the balance again after step 4 (crush remove osd) of the
> > old process. I should also note, by reweighting to 0, when I tried to run
> > "ceph osd out #", it said it was already marked out.
> >
> > I assume the docs are correct, but just want to make sure since reweighting
> > had been previously recommended.
> >
> > Regards,
> >
> > -Brent
> >
> >
> >
> > Existing Clusters:
> >
> > Test: Nautilus 14.2.2 with 3 osd servers, 1 mon/man, 1 gateway, 2 iscsi
> > gateways ( all virtual on nvme )
> >
> > US Production(HDD): Nautilus 14.2.2 with 11 osd servers, 3 mons, 4 gateways,
> > 2 iscsi gateways
> >
> > UK Production(HDD): Nautilus 14.2.2 with 12 osd servers, 3 mons, 4 gateways
> >
> > US Production(SSD): Nautilus 14.2.2 with 6 osd servers, 3 mons, 3 gateways,
> > 2 iscsi gateways
> >
> > _______________________________________________
> > ceph-users mailing list -- ceph-users@xxxxxxx
> > To unsubscribe send an email to ceph-users-leave@xxxxxxx
>



