Re: replacing OSD nodes

I don't have many comments on your proposed approach, but I wanted to
note how I would have approached this, assuming you have the same
number of new hosts as old hosts (a rough command sketch follows
after the list):
1. Swap-bucket the hosts.
2. Downweight the OSDs on the old hosts to 0.001. (Marking them out,
i.e. weight 0, prevents upmaps from being applied.)
3. Add the old hosts back into the CRUSH map in their old racks (or
wherever they belong).
4. Use https://github.com/digitalocean/pgremapper#cancel-backfill.
5. Then run https://github.com/digitalocean/pgremapper#undo-upmaps in
a loop to drain the old OSDs.
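
For illustration, here is a minimal sketch of those steps on the
command line. The host, rack, and OSD names (old-host-1, new-host-1,
rack1, osd.12, osd.13) are hypothetical, and the exact pgremapper
arguments/flags should be checked against the README sections linked
above:

    # 1. Swap the new host into the old host's CRUSH position
    #    (may require a --yes-i-really-mean-it confirmation).
    ceph osd crush swap-bucket old-host-1 new-host-1

    # 2. Downweight the old OSDs instead of marking them out, so their
    #    upmap entries remain valid (this is the override reweight,
    #    the same scale on which "out" means weight 0).
    ceph osd reweight osd.12 0.001

    # 3. Re-add the old host under its original rack.
    ceph osd crush move old-host-1 rack=rack1

    # 4. Freeze the pending backfill as upmap exceptions.
    pgremapper cancel-backfill --yes

    # 5. Remove upmaps targeting the old OSDs in batches, repeating as
    #    backfill completes, until the old OSDs are empty.
    pgremapper undo-upmaps osd.12 osd.13 --yes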

This gives you maximum concurrency and efficiency of movement, but it
doesn't necessarily solve your balance issue if it's the new OSDs that
are getting full (that wasn't clear to me). It's still possible to
apply steps 2, 4, and 5 if the new hosts are already in place. If
you're not in a rush, you could actually use the balancer instead of
undo-upmaps in step 5 to perform the rest of the data migration, and
then you wouldn't have full OSDs.
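
In case it helps, a sketch of that balancer-based alternative,
assuming the upmap balancer is used: it gradually removes the upmap
exceptions left by cancel-backfill and moves data onto the new OSDs at
its own throttled pace.

    # Let the upmap balancer perform the remaining migration instead of
    # running undo-upmaps yourself.
    ceph balancer mode upmap
    ceph balancer on
    ceph balancer status    # check progress and the current plan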

Josh

On Fri, Jul 22, 2022 at 1:57 AM Jesper Lykkegaard Karlsen
<jelka@xxxxxxxxx> wrote:
>
> It seems like low-hanging fruit to fix?
> There must be a reason why the developers have not made backfilling PGs follow a prioritized order.
> Or maybe the prioritization is based on something other than available space?
>
> That question remains unanswered, as does whether my suggested approach/script would work or not.
>
> Summer vacation?
>
> Best,
> Jesper
>
> --------------------------
> Jesper Lykkegaard Karlsen
> Scientific Computing
> Centre for Structural Biology
> Department of Molecular Biology and Genetics
> Aarhus University
> Universitetsbyen 81
> 8000 Aarhus C
>
> E-mail: jelka@xxxxxxxxx
> Tlf:    +45 50906203
>
> ________________________________
> From: Janne Johansson <icepic.dz@xxxxxxxxx>
> Sent: 20 July 2022 19:39
> To: Jesper Lykkegaard Karlsen <jelka@xxxxxxxxx>
> Cc: ceph-users@xxxxxxx <ceph-users@xxxxxxx>
> Subject: Re: replacing OSD nodes
>
> On Wed, 20 Jul 2022 at 11:22, Jesper Lykkegaard Karlsen <jelka@xxxxxxxxx> wrote:
> > Thanks for your answer, Janne.
> > Yes, I am also running "ceph osd reweight" on the "nearfull" OSDs once they get too close for comfort.
> >
> > But I just thought that continuously prioritizing the rebalancing of PGs could make this process smoother, with less (or no) need for hands-on operations.
>
> You are absolutely right there; I just wanted to chip in with my
> experience of "it nags at me, but it will still work out", so that
> other people finding these mails later on can feel a bit relieved
> knowing that a few toofull warnings aren't a major disaster and that
> it sometimes happens, because ceph looks for all possible moves, even
> those that will run late in the rebalancing.
>
> --
> May the most significant bit of your life be positive.
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


