No rebalance after ceph osd crush unlink

Dear all,

I am seeing a strange problem. Some of my hosts are linked under an additional logical data center, and I needed to unlink two of them. After unlinking the first host with

ceph osd crush unlink ceph-18 MultiSite

the crush map for this data center is updated correctly:

datacenter MultiSite {
	id -148		# do not change unnecessarily
	id -149 class hdd		# do not change unnecessarily
	id -150 class ssd		# do not change unnecessarily
	id -236 class rbd_meta		# do not change unnecessarily
	id -200 class rbd_data		# do not change unnecessarily
	id -320 class rbd_perf		# do not change unnecessarily
	# weight 643.321
	alg straw2
	hash 0	# rjenkins1
	item ceph-04 weight 79.691
	item ceph-05 weight 81.474
	item ceph-06 weight 79.691
	item ceph-07 weight 79.691
	item ceph-19 weight 81.695
	item ceph-20 weight 81.695
	item ceph-21 weight 79.691
	item ceph-22 weight 79.691
}
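Since the ms-ssd rule below selects OSDs by the rbd_data device class, the unlink should also be reflected in the per-class shadow buckets. A check along these lines (the --show-shadow option should be available in mimic, if I recall correctly) would show whether ceph-18 is still present in the MultiSite~rbd_data shadow tree:

# ceph osd crush tree --show-shadow | grep -A 12 'MultiSite~rbd_data'
# ceph osd crush tree --show-shadow | grep ceph-18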

The host is gone from the bucket. However, no data movement started. Pools using the crush rule

rule ms-ssd {
	id 12
	type replicated
	min_size 1
	max_size 10
	step take MultiSite class rbd_data
	step chooseleaf firstn 0 type host
	step emit
}
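To see which OSDs this rule would actually pick with the current map, it can be tested offline with crushtool (rule id 12 and --num-rep 3 taken from the rule and pool shown here; the file name is just a placeholder):

# ceph osd getcrushmap -o cm.bin
# crushtool -i cm.bin --test --rule 12 --num-rep 3 --show-mappings

If the OSDs of ceph-18 still appear in these computed mappings, CRUSH itself still considers them valid targets, which would point at the crush map rather than at the pool settings.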

should now move data away from the OSDs on this host, but nothing is happening. One pool using the crush rule ms-ssd is:

# ceph osd pool get sr-rbd-meta-one all
size: 3
min_size: 2
pg_num: 128
pgp_num: 128
crush_rule: ms-ssd
hashpspool: true
nodelete: true
nopgchange: false
nosizechange: false
write_fadvise_dontneed: false
noscrub: false
nodeep-scrub: false
use_gmt_hitset: 1
auid: 0
fast_read: 0
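Just to rule out a mix-up between rule name and rule id, the rule the pool actually references can be cross-checked against the osd map:

# ceph osd crush rule dump ms-ssd
# ceph osd dump | grep sr-rbd-meta-one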

However, the pool is happily keeping data on the OSDs of host ceph-18. For example, one of the OSDs on this host has ID 1076, and there are 4 PGs of this pool still using it:

# ceph pg ls-by-pool sr-rbd-meta-one | grep 1076
1.33     250        0         0       0 756156481        7834        125 3073 active+clean 2022-05-18 10:54:41.840097 757122'10112944  757122:84604327    [574,286,1076]p574    [574,286,1076]p574 2022-05-18 04:24:32.900261 2022-05-11 19:56:32.781889 
1.3d     259        0         0       0 796239360        3380         64 3006 active+clean 2022-05-18 10:54:41.749090 757122'24166942  757122:57010202 [1074,1076,1052]p1074 [1074,1076,1052]p1074 2022-05-18 06:16:35.605026 2022-05-16 19:37:56.829763 
1.4d     249        0         0       0 713678948        5690        105 3070 active+clean 2022-05-18 10:54:41.738918  757119'5861104  757122:45718157  [1072,262,1076]p1072  [1072,262,1076]p1072 2022-05-18 06:50:04.731194 2022-05-18 06:50:04.731194 
1.70     272        0         0       0 814317398        4591         76 3007 active+clean 2022-05-18 10:54:41.743604 757122'11849453  757122:72537747    [268,279,1076]p268    [268,279,1076]p268 2022-05-17 15:43:46.512941 2022-05-17 15:43:46.512941 
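The up and acting sets above are identical, so this is not just backfill that has not started yet; CRUSH apparently still maps these PGs onto OSD 1076. Checking one of them directly, and whether any pg-upmap exceptions happen to pin it (they would show up as pg_upmap_items lines in the osd dump), might narrow it down (pg 1.33 and osd 1076 taken from the listing above):

# ceph pg map 1.33
# ceph osd dump | grep pg_upmap | grep 1076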

I don't understand why these PGs are not being remapped and rebalanced. Any ideas?

The cluster is on the latest mimic release.

Thanks and best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14