Question on cluster balance and data distribution

Hi all,

In our current production cluster we have the following CRUSH
hierarchy; see https://pastebin.com/640Q4XSH or the attached image.

This reflects the real physical deployment 1:1. We also currently use a
replication factor of 3 with the following CRUSH rule on our pools:

rule hdd_replicated {
  id 0
  type replicated
  min_size 1
  max_size 10
  step take hdd
  step chooseleaf firstn 0 type rack
  step emit
}
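
In case it is useful context, the placements this rule produces can be
inspected offline with crushtool; this is just a sketch, and the output
file name is arbitrary:

ceph osd getcrushmap -o crushmap.bin   # export the compiled CRUSH map
crushtool -i crushmap.bin --test --rule 0 --num-rep 3 --show-utilization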

I can imagine that with such a design it's hard to achieve a good data
distribution, right? Moreover, we are currently getting a lot of "near
full" warnings for several OSDs and had one full OSD, even though the
cluster has more than 400 TB of free space available out of a total
cluster size of 1.2 PB.
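
For reference, the per-OSD spread can be checked with the standard
commands below; this is only a sketch, and the 110 threshold for
reweight-by-utilization is an arbitrary example, not a recommendation:

ceph osd df tree                           # per-OSD utilization and variance
ceph osd test-reweight-by-utilization 110  # dry run of a utilization-based reweight
ceph osd reweight-by-utilization 110       # apply the reweight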

So what could we consider:

1) Changing the CRUSH rule above in a way that achieves a better data distribution?

2) Adding a third "dummy" datacenter with 2 "dummy" racks and
distributing the hosts evenly across all datacenters? Then use a CRUSH
rule with "step chooseleaf firstn 0 type datacenter" (see the sketch
after this list).

3) Adding a third "dummy" datacenter, with one rack per datacenter, and
then distributing the hosts evenly across them? Then again use a CRUSH
rule with "step chooseleaf firstn 0 type datacenter".

4) Other suggestions?
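
For clarity, the rule we have in mind for options 2) and 3) would look
roughly like the sketch below; the rule name and id are placeholders,
only the datacenter-level chooseleaf step is the actual point:

rule hdd_replicated_dc {
  id 1
  type replicated
  min_size 1
  max_size 10
  step take hdd
  step chooseleaf firstn 0 type datacenter
  step emit
}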

We are currently running Luminous 12.2.4.

Thanks for any feedback...

Best,
Martin

Attachment: ceph_cursh_hierarchy.png
Description: PNG image

