Re: Safe to move misplaced hosts between failure domains in the crush tree?

Yeah, that should work, no problem.

In this case I would even recommend setting `norebalance` and using the trusty old upmap-remapped script (credit to CERN) to avoid unnecessary data movement: https://github.com/cernceph/ceph-scripts/blob/master/tools/upmap/upmap-remapped.py
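
A rough sketch of how that could look, using the host/datacenter names from the thread below (the exact upmap-remapped invocation may differ in your environment, so treat this as an outline rather than a recipe):

    # pause automatic rebalancing while the CRUSH tree is corrected
    ceph osd set norebalance

    # move the swapped hosts under the datacenters they physically live in
    ceph osd crush move ceph-flash1 datacenter=HX1
    ceph osd crush move ceph-flash2 datacenter=714

    # let upmap-remapped pin the now-misplaced PGs to their current OSDs
    ./upmap-remapped.py | sh

    # re-enable rebalancing; the balancer can then move data gradually
    ceph osd unset norebalance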

Cheers!
--

Matthias Grandl
Head Storage Engineer
matthias.grandl@xxxxxxxx

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263
Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx

> On 12. Jun 2024, at 09:33, Torkil Svensgaard <torkil@xxxxxxxx> wrote:
> 
> 
> 
> On 12/06/2024 10:22, Matthias Grandl wrote:
>> Correct, this should only result in misplaced objects.
>> > We made a mistake when we moved the servers physically, so while the replica 3 is intact, the crush tree is not accurate.
>> Can you elaborate on that? Does this mean that, after the move, multiple hosts are inside the same physical datacenter? In that case, once you correct the CRUSH layout, you would be running misplaced without a way to rebalance pools that are using a datacenter crush rule.
> 
> Hi Matthias
> 
> Thanks for replying. Two of the three hosts were swapped, so I would do:
> 
> ceph osd crush move ceph-flash1 datacenter=HX1
> ceph osd crush move ceph-flash2 datacenter=714
> 
> 
> And end up with 2/3 misplaced:
> 
>  -1         4437.29248  root default
> -33         1467.84814      datacenter 714
> -69           69.86389          host ceph-flash2
> -34         1511.25378      datacenter HX1
> -73           69.86389          host ceph-flash1
> -36         1458.19067      datacenter UXH
> -77           69.86389          host ceph-flash3
> 
> It would only briefly be invalid between the two commands.
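> 
> As a sanity check after both moves I would run something like the following (nothing beyond standard status commands assumed):
> 
>   # confirm both hosts now sit under the intended datacenter buckets
>   ceph osd crush tree
> 
>   # expect misplaced objects in the output, but no degraded or undersized PGs
>   ceph -s
>   ceph health detail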
> 
> Best regards,
> 
> Torkil
> 
> 
>> Cheers!
>> --
>> Matthias Grandl
>> Head Storage Engineer
>> matthias.grandl@xxxxxxxx
>> Looking for help with your Ceph cluster? Contact us at https://croit.io
>> croit GmbH, Freseniusstr. 31h, 81247 Munich
>> CEO: Martin Verges - VAT-ID: DE310638492
>> Com. register: Amtsgericht Munich HRB 231263
>> Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx
>>> On 12. Jun 2024, at 09:13, Torkil Svensgaard <torkil@xxxxxxxx> wrote:
>>> 
>>> Hi
>>> 
>>> We have 3 servers for replica 3 with failure domain datacenter:
>>> 
>>>  -1         4437.29248  root default
>>> -33         1467.84814      datacenter 714
>>> -69           69.86389          host ceph-flash1
>>> -34         1511.25378      datacenter HX1
>>> -73           69.86389          host ceph-flash2
>>> -36         1458.19067      datacenter UXH
>>> -77           69.86389          host ceph-flash3
>>> 
>>> We made a mistake when we moved the servers physically, so while the replica 3 is intact, the crush tree is not accurate.
>>> 
>>> If we just remedy the situation with "ceph osd crush move ceph-flashX datacenter=Y" we will just end up with a lot of misplaced data and some churn, right? Or will the affected pool go degraded/unavailable?
>>> 
>>> Best regards,
>>> 
>>> Torkil
>>> -- 
>>> Torkil Svensgaard
>>> Sysadmin
>>> MR-Forskningssektionen, afs. 714
>>> DRCMR, Danish Research Centre for Magnetic Resonance
>>> Hvidovre Hospital
>>> Kettegård Allé 30
>>> DK-2650 Hvidovre
>>> Denmark
>>> Tel: +45 386 22828
>>> E-mail: torkil@xxxxxxxx
>>> _______________________________________________
>>> ceph-users mailing list -- ceph-users@xxxxxxx
>>> To unsubscribe send an email to ceph-users-leave@xxxxxxx
> 
> -- 
> Torkil Svensgaard
> Sysadmin
> MR-Forskningssektionen, afs. 714
> DRCMR, Danish Research Centre for Magnetic Resonance
> Hvidovre Hospital
> Kettegård Allé 30
> DK-2650 Hvidovre
> Denmark
> Tel: +45 386 22828
> E-mail: torkil@xxxxxxxx
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



