Ceph Balancer per Pool/Crush Unit


 



Hi Cephers,

I’m starting to play with the Ceph balancer module after moving to straw2, and I’ve run into something I’m surprised I haven’t seen posted here.

My cluster has two crush roots: one for HDDs, one for SSDs.

Right now the HDDs back a single pool to themselves, and the SSDs back a single pool to themselves.

Using ceph balancer eval, I can see the eval score for the HDDs (worse), for the SSDs (better), and the blended score for the cluster overall:
pool "hdd" score 0.012529 (lower is better)
pool "ssd" score 0.004654 (lower is better)
current cluster score 0.008484 (lower is better)

My problem is that I need to get my HDDs better balanced while leaving my SSDs alone, because shuffling data wears the SSDs unnecessarily, and it has actually made their distribution worse over time: https://imgur.com/RVh0jfH
You can see that between 06:00 and 09:00 on the second day in the graph the spread was very tight, and then it expanded back out.

So my question is: how can I run the balancer on just my HDDs without touching my SSDs?

I removed about 15% of the PGs living on the HDDs because they were empty.
I also have two tiers of HDDs, 8 TB and 2 TB, but they are roughly equally weighted in crush at the chassis level, where my failure domains are configured.
Hopefully this abbreviated ceph osd tree shows the hierarchy; multipliers for each bucket are on the right.
ID  CLASS WEIGHT    TYPE NAME
 -1       218.49353 root default.hdd
-10       218.49353     rack default.rack-hdd
-70        43.66553         chassis hdd-2tb-chassis1 *1
-67        43.66553             host hdd-2tb-24-1       *1
 74   hdd   1.81940                 osd.74              *24
-55        43.70700         chassis hdd-8tb-chassis1    *4
 -2        21.85350             host hdd-8tb-3-1
  0   hdd   7.28450                 osd.0               *3
 -3        21.85350             host hdd-8tb-3-1
  1   hdd   7.28450                 osd.1               *3

I assume this doesn’t complicate things too much, but figured I would mention it, since it is presumably harder to distribute data evenly across OSDs with a 4:1 size difference.

If I create a plan with ceph balancer optimize plan1,
and then run ceph balancer show plan1, I see entries of the form:
ceph osd crush weight-set reweight-compat $OSD $ArbitraryNumberNearOsdSize

Could I copy this output, remove the entries for SSD OSDs, and then run the remaining ceph osd crush weight-set reweight-compat commands in a script?
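Not an authoritative answer, but a minimal sketch of the filtering step you describe, assuming the plan lines look like the reweight-compat commands quoted above. The plan text would come from ceph balancer show plan1, and the HDD OSD ids from something like ceph osd ls-tree default.hdd; the example plan lines and osd.100 (standing in for an SSD) are made up for illustration.

```python
# Sketch: keep only the reweight-compat commands that target HDD OSDs,
# so SSD OSDs are never rewritten. Assumes the plan output format quoted
# in the post: "ceph osd crush weight-set reweight-compat osd.N <weight>".
import re

def hdd_only_commands(plan_text, hdd_osd_ids):
    """Return the plan lines whose osd id is in hdd_osd_ids."""
    keep = []
    for line in plan_text.splitlines():
        m = re.search(r"reweight-compat\s+osd\.(\d+)\s", line)
        if m and int(m.group(1)) in hdd_osd_ids:
            keep.append(line)
    return keep

# Hypothetical plan output: osd.0 and osd.74 are HDDs (per the tree above),
# osd.100 stands in for an SSD that we want left untouched.
plan = """\
ceph osd crush weight-set reweight-compat osd.0 7.10000
ceph osd crush weight-set reweight-compat osd.74 1.80000
ceph osd crush weight-set reweight-compat osd.100 0.90000
"""
print("\n".join(hdd_only_commands(plan, {0, 74})))
```

The surviving lines could be written to a file, reviewed, and run as a shell script. Do review before running: if the plan format differs in your release, the regex would need adjusting.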

My SSDs and I appreciate any insight.

Thanks,

Reed
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
