Re: Adding datacenter level to CRUSH tree causes rebalancing


 



Hi Niklas,

As I said, Ceph placement is based on more than fulfilling the failure-domain constraint; this is a core feature of Ceph's design. There is no reason for a rebalancing on a cluster with a few hundred OSDs to last a month. On versions before Quincy (17), you have to adjust the max backfills parameter, whose default of 1 is very conservative. Using 2 should already reduce the rebalancing to a few days. But in my experience, if it is an option, upgrading to Quincy first may be the better choice, thanks to its autotuning of the number of backfills based on the actual load of the cluster.
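If it helps, the backfill throttle mentioned above can be raised at runtime on pre-Quincy clusters (a sketch; the right value depends on your hardware and client-load tolerance):

```shell
# Check the current value (the default is 1 on Pacific and earlier)
ceph config get osd osd_max_backfills

# Raise it cluster-wide; 2-3 is a common conservative bump,
# higher values speed up rebalancing at the cost of client I/O
ceph config set osd osd_max_backfills 2
```

Note that on Quincy with the mClock scheduler this setting is managed automatically, which is part of why the upgrade helps.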

If your cluster is using cephadm, upgrading to Quincy is very straightforward and should complete in a couple of hours for the cluster size I mentioned.
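For reference, a cephadm upgrade is a rolling, orchestrated operation started with a single command (the version shown is only an example; substitute the current Quincy point release):

```shell
# Start a rolling upgrade to Quincy (pick the latest 17.2.x release)
ceph orch upgrade start --ceph-version 17.2.6

# Monitor progress while daemons are restarted one by one
ceph orch upgrade status
```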

Cheers,

Michel
Sent from my mobile
On 20 July 2023 at 20:15:54, Niklas Hambüchen <mail@xxxxxx> wrote:

Thank you both Michel and Christian.

Looks like I will have to do the rebalancing eventually.
From past experience with Ceph 16, the rebalance will likely take at least a month with my 500 M objects.

It seems like a good idea to upgrade to Ceph 17 first as Michel suggests.

Unless:

I was hoping that Ceph might have a way to reduce the rebalancing, given that all constraints about failure domains are already fulfilled.

In particular, I was wondering whether I could play with the names of the "datacenter"s, to bring them in the same (alphabetical?) order as the hosts were so far.
I suspect that this is what avoided the reshuffling on my mini test cluster.
I think it would be in alignment with Table 1 from the CRUSH paper: https://ceph.com/assets/pdfs/weil-crush-sc06.pdf

E.g. perhaps

take(root)
select(1, row)
select(3, cabinet)
emit

yields the same result as

take(root)
select(3, row)
select(1, cabinet)
emit

?
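One way to answer this empirically, without touching the live cluster, is to compare the mappings of the old and new CRUSH maps offline with crushtool (a sketch; filenames and the rule/replica numbers are placeholders for your setup):

```shell
# Export the current CRUSH map and record the mappings it produces
ceph osd getcrushmap -o crush.current
crushtool -i crush.current --test --show-mappings --rule 0 --num-rep 3 > mappings.before

# Decompile, add the datacenter buckets by hand, recompile, and re-test
crushtool -d crush.current -o crush.txt
# ... edit crush.txt here ...
crushtool -c crush.txt -o crush.new
crushtool -i crush.new --test --show-mappings --rule 0 --num-rep 3 > mappings.after

# The number of differing lines approximates how many PGs would move
diff mappings.before mappings.after | wc -l
```

That lets you try different datacenter names (and orderings) and see which variant minimizes movement before setting the new map.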


Niklas
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx




