Re: Safe to move misplaced hosts between failure domains in the crush tree?

On 13/06/2024 12:17, Bandelow, Gunnar wrote:
> Hi Torkil,

Hi Gunnar

> Maybe I'm overlooking something, but how about just renaming the datacenter buckets?

Here's the ceph osd tree command header and my pruned tree:

ID    CLASS  WEIGHT      TYPE NAME                 STATUS  REWEIGHT  PRI-AFF
  -1         4437.29248  root default
 -33         1467.84814      datacenter 714
 -69           69.86389          host ceph-flash1
 -34         1511.25378      datacenter HX1
 -73           69.86389          host ceph-flash2
 -36         1458.19067      datacenter UXH
 -77           69.86389          host ceph-flash3

The weights reveal that there are other hosts in the datacenter buckets, so renaming won't help.
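
For reference, had each datacenter bucket only held its one flash host, the rename approach could presumably be done with "ceph osd crush rename-bucket", going through a temporary name since both bucket names already exist. A rough, untested sketch:

# Hypothetical rename swap - not applicable here, since the other hosts
# in the 714 and HX1 buckets would be "swapped" along with the names
ceph osd crush rename-bucket 714 dc-tmp
ceph osd crush rename-bucket HX1 714
ceph osd crush rename-bucket dc-tmp HX1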

Mvh.

Torkil

Best regards,
Gunnar

--- Original Message ---
*Subject: * Re: Safe to move misplaced hosts between failure domains in the crush tree?
*From: *"Torkil Svensgaard" <torkil@xxxxxxxx>
*To: *"Matthias Grandl" <matthias.grandl@xxxxxxxx>
*CC: *ceph-users@xxxxxxx, "Ruben Vestergaard" <rkv@xxxxxxxx>
*Date: *12-06-2024 10:33



    On 12/06/2024 10:22, Matthias Grandl wrote:
     > Correct, this should only result in misplaced objects.
     >
     >  > We made a mistake when we moved the servers physically so
     >  > while the replica 3 is intact the crush tree is not accurate.
     >
     > Can you elaborate on that? Does this mean after the move, multiple
     > hosts are inside the same physical datacenter? In that case, once
     > you correct the CRUSH layout, you would be running misplaced
     > without a way to rebalance pools that are using a datacenter crush
     > rule.

    Hi Matthias

    Thanks for replying. Two of the three hosts were swapped, so I would do:

    ceph osd crush move ceph-flash1 datacenter=HX1
    ceph osd crush move ceph-flash2 datacenter=714


    and end up with 2/3 of the data misplaced:

        -1         4437.29248  root default
       -33         1467.84814      datacenter 714
       -69           69.86389          host ceph-flash2
       -34         1511.25378      datacenter HX1
       -73           69.86389          host ceph-flash1
       -36         1458.19067      datacenter UXH
       -77           69.86389          host ceph-flash3

    It would only briefly be invalid between the two commands.
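
    To limit churn, the two moves could presumably be wrapped in the
    norebalance/nobackfill flags so data only starts moving once the
    tree is correct again. A rough sketch (the flags are optional):

    # hold back data movement while the CRUSH tree is corrected
    ceph osd set norebalance
    ceph osd set nobackfill

    # swap the two hosts into their actual datacenters
    ceph osd crush move ceph-flash1 datacenter=HX1
    ceph osd crush move ceph-flash2 datacenter=714

    # let the now-misplaced objects move
    ceph osd unset nobackfill
    ceph osd unset norebalance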

    Mvh.

    Torkil


     > Cheers!
     >
     > --
     >
     > Matthias Grandl
     > Head Storage Engineer
     > matthias.grandl@xxxxxxxx
     >
     > Looking for help with your Ceph cluster? Contact us at
     > https://croit.io
     >
     > croit GmbH, Freseniusstr. 31h, 81247 Munich
     > CEO: Martin Verges - VAT-ID: DE310638492
     > Com. register: Amtsgericht Munich HRB 231263
     > Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx
     >
     >> On 12. Jun 2024, at 09:13, Torkil Svensgaard <torkil@xxxxxxxx> wrote:
     >>
     >> Hi
     >>
     >> We have 3 servers for replica 3 with failure domain datacenter:
     >>
     >>  -1         4437.29248  root default
     >> -33         1467.84814      datacenter 714
     >> -69           69.86389          host ceph-flash1
     >> -34         1511.25378      datacenter HX1
     >> -73           69.86389          host ceph-flash2
     >> -36         1458.19067      datacenter UXH
     >> -77           69.86389          host ceph-flash3
     >>
     >> We made a mistake when we moved the servers physically so while the
     >> replica 3 is intact the crush tree is not accurate.
     >>
     >> If we just remedy the situation with "ceph osd crush move
     >> ceph-flashX datacenter=Y" we will just end up with a lot of
     >> misplaced data and some churn, right? Or will the affected pool
     >> go degraded/unavailable?
     >>
     >> Mvh.
     >>
     >> Torkil
     >> --
     >> Torkil Svensgaard
     >> Sysadmin
     >> MR-Forskningssektionen, afs. 714
     >> DRCMR, Danish Research Centre for Magnetic Resonance
     >> Hvidovre Hospital
     >> Kettegård Allé 30
     >> DK-2650 Hvidovre
     >> Denmark
     >> Tel: +45 386 22828
     >> E-mail: torkil@xxxxxxxx
     >> _______________________________________________
     >> ceph-users mailing list -- ceph-users@xxxxxxx
     >> To unsubscribe send an email to ceph-users-leave@xxxxxxx
     >

    --
    Torkil Svensgaard
    Sysadmin
    MR-Forskningssektionen, afs. 714
    DRCMR, Danish Research Centre for Magnetic Resonance
    Hvidovre Hospital
    Kettegård Allé 30
    DK-2650 Hvidovre
    Denmark
    Tel: +45 386 22828
    E-mail: torkil@xxxxxxxx
    _______________________________________________
    ceph-users mailing list -- ceph-users@xxxxxxx
    To unsubscribe send an email to ceph-users-leave@xxxxxxx


_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx

--
Torkil Svensgaard
Sysadmin
MR-Forskningssektionen, afs. 714
DRCMR, Danish Research Centre for Magnetic Resonance
Hvidovre Hospital
Kettegård Allé 30
DK-2650 Hvidovre
Denmark
Tel: +45 386 22828
E-mail: torkil@xxxxxxxx
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



