Hi Dan,

thanks for pointing me to this. Yes, it looks like that bug: the shadow tree was not changed, although it should have been updated as well. The shadow tree is not even shown in the crush map I exported with getcrushmap; the option --show-shadow did the trick.

Will `ceph osd crush reweight-all` actually remove these shadow leaves, or just set their weight to 0? I need to link this host again later and would like a solution that is as clean as possible. What would happen, for example, if I edit the crush map and run setcrushmap? Will it recompile the correct crush map from the textual definition, or will these dangling leaves persist?
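In case it matters for the answer, this is the round trip I have in mind for editing the map by hand (just a sketch; the file names are placeholders):

    ceph osd getcrushmap -o crushmap.bin          # export the binary crush map
    crushtool -d crushmap.bin -o crushmap.txt     # decompile to the textual definition
    # edit crushmap.txt (the shadow buckets are not part of the text), then recompile and inject:
    crushtool -c crushmap.txt -o crushmap-new.bin
    ceph osd setcrushmap -i crushmap-new.bin

The question is whether that last step rebuilds the shadow trees from scratch or carries the stale ones over.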
Thanks!
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Dan van der Ster <dvanders@xxxxxxxxx>
Sent: 18 May 2022 12:04:07
To: Frank Schilder
Cc: ceph-users
Subject: Re: No rebalance after ceph osd crush unlink

Hi Frank,

Did you check the shadow tree (the one with tildes in the name, seen with `ceph osd crush tree --show-shadow`)? Maybe the host was removed from the outer tree, but not from the one used for device-class selection.

There were bugs in this area before, e.g. https://tracker.ceph.com/issues/48065

In those cases, the way to make the crush tree consistent again was `ceph osd crush reweight-all`.

Cheers, Dan

On Wed, May 18, 2022 at 11:51 AM Frank Schilder <frans@xxxxxx> wrote:
>
> Dear all,
>
> I have a strange problem. I have some hosts linked under an additional logical data center and needed to unlink two of the hosts. After unlinking the first host with
>
> ceph osd crush unlink ceph-18 MultiSite
>
> the crush map for this data center is updated correctly:
>
> datacenter MultiSite {
>         id -148                   # do not change unnecessarily
>         id -149 class hdd         # do not change unnecessarily
>         id -150 class ssd         # do not change unnecessarily
>         id -236 class rbd_meta    # do not change unnecessarily
>         id -200 class rbd_data    # do not change unnecessarily
>         id -320 class rbd_perf    # do not change unnecessarily
>         # weight 643.321
>         alg straw2
>         hash 0  # rjenkins1
>         item ceph-04 weight 79.691
>         item ceph-05 weight 81.474
>         item ceph-06 weight 79.691
>         item ceph-07 weight 79.691
>         item ceph-19 weight 81.695
>         item ceph-20 weight 81.695
>         item ceph-21 weight 79.691
>         item ceph-22 weight 79.691
> }
>
> The host is gone. However, nothing happened. The pools with the crush rule
>
> rule ms-ssd {
>         id 12
>         type replicated
>         min_size 1
>         max_size 10
>         step take MultiSite class rbd_data
>         step chooseleaf firstn 0 type host
>         step emit
> }
>
> should now move data away from the OSDs on this host, but nothing is happening. A pool with crush rule ms-ssd is:
>
> # ceph osd pool get sr-rbd-meta-one all
> size: 3
> min_size: 2
> pg_num: 128
> pgp_num: 128
> crush_rule: ms-ssd
> hashpspool: true
> nodelete: true
> nopgchange: false
> nosizechange: false
> write_fadvise_dontneed: false
> noscrub: false
> nodeep-scrub: false
> use_gmt_hitset: 1
> auid: 0
> fast_read: 0
>
> However, it is happily keeping data on the OSDs of host ceph-18. For example, one of the OSDs on this host has ID 1076. There are 4 PGs using this OSD:
>
> # ceph pg ls-by-pool sr-rbd-meta-one | grep 1076
> 1.33 250 0 0 0 756156481 7834 125 3073 active+clean 2022-05-18 10:54:41.840097 757122'10112944 757122:84604327 [574,286,1076]p574 [574,286,1076]p574 2022-05-18 04:24:32.900261 2022-05-11 19:56:32.781889
> 1.3d 259 0 0 0 796239360 3380 64 3006 active+clean 2022-05-18 10:54:41.749090 757122'24166942 757122:57010202 [1074,1076,1052]p1074 [1074,1076,1052]p1074 2022-05-18 06:16:35.605026 2022-05-16 19:37:56.829763
> 1.4d 249 0 0 0 713678948 5690 105 3070 active+clean 2022-05-18 10:54:41.738918 757119'5861104 757122:45718157 [1072,262,1076]p1072 [1072,262,1076]p1072 2022-05-18 06:50:04.731194 2022-05-18 06:50:04.731194
> 1.70 272 0 0 0 814317398 4591 76 3007 active+clean 2022-05-18 10:54:41.743604 757122'11849453 757122:72537747 [268,279,1076]p268 [268,279,1076]p268 2022-05-17 15:43:46.512941 2022-05-17 15:43:46.512941
>
> I don't understand why these are not remapped and rebalancing. Any ideas?
>
> The version is latest Mimic.
>
> Thanks and best regards,
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
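PS: To see what CRUSH actually computes for the ms-ssd rule with the current map, I am also going to test the map offline with crushtool (a sketch; rule id 12 and 3 replicas are taken from the rule and pool above, the file name is a placeholder):

    ceph osd getcrushmap -o crushmap.bin
    crushtool -i crushmap.bin --test --rule 12 --num-rep 3 --show-mappings     # print the OSDs chosen for a range of sample inputs
    crushtool -i crushmap.bin --test --rule 12 --num-rep 3 --show-utilization  # summarize how many mappings land on each OSD

If OSDs of ceph-18, such as 1076, still show up in those mappings, that would fit the theory that the stale shadow tree is still being used for placement.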