Re: What to expect on rejoining a host to cluster?

Frank,

 Then if you have only a few OSDs with excessive PG counts / usage, do you
reweight them down by something like 10-20% to achieve a better distribution
and improve capacity?  Do you weight them back to normal after the PGs have moved?
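
 For example, is it roughly something like the following, just to make sure I
understand (the OSD id and weight values here are hypothetical)?

    ceph osd reweight 12 0.85    # push roughly 15% of osd.12's PGs elsewhere
    # ...wait for the backfill to finish...
    ceph osd reweight 12 1.0     # restore the default override weight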

 I wondered if manually picking out some of the higher data usage OSDs could
get to a good outcome and avoid continuous rebalancing or other issues.

 Thanks,
   Matt

On Mon, Dec 5, 2022 at 4:32 AM Frank Schilder <frans@xxxxxx> wrote:

> Hi Matt,
>
> I can't comment on balancers, I don't use them. I manually re-weight OSDs,
> which fits well with our pools' OSD allocation. Also, we don't aim for
> perfect balance, we just remove the peak of allocation on the fullest few
> OSDs to avoid excessive capacity loss. Not balancing too much has the
> advantage of being fairly stable under OSD failures/additions, at the expense
> of a few % of capacity.
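>
> As a rough sketch, the fullest few OSDs can be spotted from the %USE and PGS
> columns of the standard utilization report (the exact column layout varies a
> bit between Ceph versions), and those are the only ones whose weight I touch:
>
>   ceph osd df tree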
>
> Maybe someone else can help here?
>
> Best regards,
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> ________________________________________
> From: Matt Larson <larsonmattr@xxxxxxxxx>
> Sent: 04 December 2022 02:00:11
> To: Eneko Lacunza
> Cc: Frank Schilder; ceph-users
> Subject: Re:  Re: What to expect on rejoining a host to
> cluster?
>
> Thank you Frank and Eneko,
>
>  Without help and support from ceph admins like you, I would be adrift.  I
> really appreciate this.
>
>  I rejoined the host one week ago now, and the cluster has been dealing
> with the misplaced objects and recovering well.
>
> I will use this strategy in the future:
>
> "If you consider replacing the host and all disks, get a new host first
> and give it the host name in the crush map. Just before you deploy the new
> host, simply purge all down OSDs in its bucket (set norebalance) and
> deploy. Then, the data movement is restricted to re-balancing to the new
> host.
>
> If you just want to throw out the old host, destroy the OSDs but keep the
> IDs intact (ceph osd destroy). Then, no further re-balancing will happen
> and you can re-use the OSD ids later when adding a new host. That's a
> stable situation from an operations point of view."
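>
> In command form, I read the first option as roughly the following (the OSD
> ids are just placeholders for the down OSDs in the old host's bucket):
>
>   ceph osd set norebalance
>   ceph osd purge <id> --yes-i-really-mean-it   # repeat for each down OSD
>   # ...deploy the replacement host under the old host's crush name...
>   ceph osd unset norebalance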
>
> The last question I have: I am now seeing that some OSDs have an uneven
> load of PGs. Which balancer do you recommend, and are there any caveats for
> how the balancer operations can affect/slow the cluster?
>
> Thanks,
>   Matt
>
> On Mon, Nov 28, 2022 at 2:23 AM Eneko Lacunza <elacunza@xxxxxxxxx> wrote:
> Hi Matt,
>
> Also, make sure that the rejoining host has the correct time. I have seen
> clusters go down when rejoining hosts that had been down for maintenance for
> several weeks and came back with datetime deltas of some months (no idea why
> that happened, I arrived with the firefighting team ;-) )
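>
> A quick check on the rejoining host before starting its OSDs is cheap
> insurance, e.g.:
>
>   timedatectl status
>   chronyc tracking   # or the equivalent for whatever time daemon is in use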
>
> Cheers
>
> On 27/11/22 at 13:27, Frank Schilder wrote:
>
> Hi Matt,
>
> if you didn't touch the OSDs on that host, they will join and only objects
> that have been modified will actually be updated. Ceph keeps some basic
> history information and can detect changes. 2 weeks is not a very long
> time. If you have a lot of cold data, re-integration will go fast.
>
> Initially, you will see a huge amount of misplaced objects. However, this
> count will go down much faster than the raw objects/s recovery rate would
> suggest.
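>
> You can follow that from a second terminal with something like:
>
>   watch -n 10 ceph -s   # misplaced object count and recovery rate
>   ceph pg stat          # one-line PG/recovery summary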
>
> Before you rejoin the host, I would fix its issues though. Now that you
> have it out of the cluster, do the maintenance first. There is no rush. In
> fact, you can buy a new host, install the OSDs in the new one and join that
> to the cluster with the host-name of the old host.
>
> If you consider replacing the host and all disks, get a new host first
> and give it the host name in the crush map. Just before you deploy the new
> host, simply purge all down OSDs in its bucket (set norebalance) and
> deploy. Then, the data movement is restricted to re-balancing to the new
> host.
>
> If you just want to throw out the old host, destroy the OSDs but keep the
> IDs intact (ceph osd destroy). Then, no further re-balancing will happen
> and you can re-use the OSD ids later when adding a new host. That's a
> stable situation from an operations point of view.
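>
> In commands, that is roughly one line per OSD on the old host (the ids are
> whatever `ceph osd tree` shows under that host's bucket):
>
>   ceph osd destroy <id> --yes-i-really-mean-it
>
> The freed ids can then be handed back out when the replacement OSDs are
> created (e.g. via ceph-volume's --osd-id option).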
>
> Hope that helps.
>
> Best regards,
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> ________________________________________
> From: Matt Larson <larsonmattr@xxxxxxxxx>
> Sent: 26 November 2022 21:07:41
> To: ceph-users
> Subject:  What to expect on rejoining a host to cluster?
>
> Hi all,
>
>  I have had a host with 16 OSDs, each 14TB in capacity that started having
> hardware issues causing it to crash.  I took this host down 2 weeks ago,
> and the data rebalanced to the remaining 11 server hosts in the Ceph
> cluster over this time period.
>
>  My initial goal was to then remove the host completely from the cluster
> with `ceph osd rm XX` and `ceph osd purge XX` (Adding/Removing OSDs — Ceph
> Documentation,
> https://docs.ceph.com/en/latest/rados/operations/add-or-rm-osds/).
> However, I found that after the large amount of data migration from the
> recovery, the purge and removal of the OSDs from the crush map still
> required another large data move.  It appears that it would have been a
> better strategy to assign a 0 weight to each OSD first, so that there is
> only a single, larger data move instead of two.
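>
> In other words, something like this per OSD before removing it (the OSD id
> here is hypothetical):
>
>   ceph osd crush reweight osd.17 0           # drain the OSD in a single data move
>   # ...wait for the backfill to complete...
>   ceph osd purge 17 --yes-i-really-mean-it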
>
>  I'd like to join the downed server back into the Ceph cluster.  It still
> has 14 OSDs that are listed as out/down that would be brought back online.
> My question is what can I expect if I bring this host online?  Will the
> OSDs of a host that has been offline for an extended period of time and out
> of the cluster have PGs that are now quite different or inconsistent?  Will
> this be problematic?
>
>  Thanks for any advice,
>    Matt
>
> --
> Matt Larson, PhD
> Madison, WI  53705 U.S.A.
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>
>
>
> Eneko Lacunza
> Zuzendari teknikoa | Director técnico
> Binovo IT Human Project
>
> Tel. +34 943 569 206 | https://www.binovo.es
> Astigarragako Bidea, 2 - 2º izda. Oficina 10-11, 20180 Oiartzun
>
> https://www.youtube.com/user/CANALBINOVO
> https://www.linkedin.com/company/37269706/
>
>
> --
> Matt Larson, PhD
> Madison, WI  53705 U.S.A.
>
-- 
Matt Larson, PhD
Madison, WI  53705 U.S.A.
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



