Re: Safe to move misplaced hosts between failure domains in the crush tree?

On 12/06/2024 11:20, Matthias Grandl wrote:
Yeah that should work no problem.

In this case I would even recommend setting `norebalance` and using the trusty old upmap-remapped script (credits to CERN) to avoid unnecessary data movement: https://github.com/cernceph/ceph-scripts/blob/master/tools/upmap/upmap-remapped.py
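
In practice that sequence would look roughly like the following. This is a sketch of the workflow as I understand it from the script's README, not a transcript of what was actually run here, so review the generated commands before applying them:

ceph osd set norebalance
ceph osd crush move ceph-flash1 datacenter=HX1
ceph osd crush move ceph-flash2 datacenter=714
# upmap-remapped.py prints "ceph osd pg-upmap-items ..." commands that pin the
# now-misplaced PGs to the OSDs they currently live on; inspect, then pipe to sh
./upmap-remapped.py | sh
ceph osd unset norebalance

The balancer then removes those upmap exceptions gradually, so the data moves at a controlled pace rather than all at once.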

Worked like a charm with hardly any data movement. I used the pgremapper tool[1] just in case and am now letting the balancer do its thing.
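
For the archive: the pgremapper mode that covers this scenario is, as far as I know, cancel-backfill, which does much the same as the CERN script by writing upmap entries that map remapped PGs back onto their current OSDs. A sketch based on the pgremapper README, not on the exact invocation used here:

pgremapper cancel-backfill --yes

Without --yes the tool only prints what it would do, which is a reasonable first step on a production cluster.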

Cheers!

Thanks!

Best regards,

Torkil

[1] https://github.com/digitalocean/pgremapper

--

Matthias Grandl
Head Storage Engineer
matthias.grandl@xxxxxxxx

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263
Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx

On 12. Jun 2024, at 09:33, Torkil Svensgaard <torkil@xxxxxxxx> wrote:



On 12/06/2024 10:22, Matthias Grandl wrote:
Correct, this should only result in misplaced objects.
> We made a mistake when we moved the servers physically so while the replica 3 is intact the crush tree is not accurate.

Can you elaborate on that? Does this mean that after the move, multiple hosts are inside the same physical datacenter? In that case, once you correct the CRUSH layout, you would be running misplaced without a way to rebalance pools that are using a datacenter crush rule.

Hi Matthias

Thanks for replying. Two of the three hosts were swapped, so I would do:

ceph osd crush move ceph-flash1 datacenter=HX1
ceph osd crush move ceph-flash2 datacenter=714


And end up with 2/3 misplaced:

 -1         4437.29248  root default
-33         1467.84814      datacenter 714
-69           69.86389          host ceph-flash2
-34         1511.25378      datacenter HX1
-73           69.86389          host ceph-flash1
-36         1458.19067      datacenter UXH
-77           69.86389          host ceph-flash3

It would only briefly be invalid between the two commands.
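
A quick sanity check after both moves would be the usual status commands (nothing beyond standard tooling, just noting it for completeness):

ceph osd tree | grep -E 'datacenter|ceph-flash'
ceph -s    # expect misplaced PGs only; nothing should show as degraded, since all three replicas still exist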

Best regards,

Torkil


Cheers!
--
Matthias Grandl
Head Storage Engineer
matthias.grandl@xxxxxxxx

Looking for help with your Ceph cluster? Contact us at https://croit.io
croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263
Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx
On 12. Jun 2024, at 09:13, Torkil Svensgaard <torkil@xxxxxxxx> wrote:

Hi

We have 3 servers for replica 3 with failure domain datacenter:

 -1         4437.29248  root default
-33         1467.84814      datacenter 714
-69           69.86389          host ceph-flash1
-34         1511.25378      datacenter HX1
-73           69.86389          host ceph-flash2
-36         1458.19067      datacenter UXH
-77           69.86389          host ceph-flash3
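
For context, a replicated rule that uses datacenter as the failure domain would look roughly like this in a decompiled crushmap (the rule name and id below are made up for illustration; the actual rule is not shown in this thread):

rule replicated_datacenter {
    id 1
    type replicated
    step take default
    step chooseleaf firstn 0 type datacenter
    step emit
}

With three datacenter buckets and size 3, each PG keeps one replica per datacenter, which is why a host sitting under the wrong datacenter bucket matters.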

We made a mistake when we moved the servers physically so while the replica 3 is intact the crush tree is not accurate.

If we just remedy the situation with "ceph osd crush move ceph-flashX datacenter=Y", we will end up with a lot of misplaced data and some churn, right? Or will the affected pool go degraded/unavailable?

Best regards,

Torkil
--
Torkil Svensgaard
Sysadmin
MR-Forskningssektionen, afs. 714
DRCMR, Danish Research Centre for Magnetic Resonance
Hvidovre Hospital
Kettegård Allé 30
DK-2650 Hvidovre
Denmark
Tel: +45 386 22828
E-mail: torkil@xxxxxxxx
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx

--
Torkil Svensgaard
Sysadmin
MR-Forskningssektionen, afs. 714
DRCMR, Danish Research Centre for Magnetic Resonance
Hvidovre Hospital
Kettegård Allé 30
DK-2650 Hvidovre
Denmark
Tel: +45 386 22828
E-mail: torkil@xxxxxxxx


--
Torkil Svensgaard
Sysadmin
MR-Forskningssektionen, afs. 714
DRCMR, Danish Research Centre for Magnetic Resonance
Hvidovre Hospital
Kettegård Allé 30
DK-2650 Hvidovre
Denmark
Tel: +45 386 22828
E-mail: torkil@xxxxxxxx
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



