Hi everyone,
I am facing the classic uneven data distribution issue, but it seems to affect only one of the two pools in my cluster.
Here are the numbers:
- Ceph is 9.2.0 (Infernalis)
- The cluster has 84 OSDs (4 TB each) across 7 nodes
- 2 pools are used: the first is size=3 and acts as a cache tier for the second one (EC, 4+2)
- the cache pool has 1024 PGs (we started at 512 and then increased to 1024), the data pool has 2048 PGs
- all the OSDs are used for both pools
- the crushmap is basically the default one
- Cluster overall utilization is 66%
- PGs per OSD vary from 154 to 226
- OSD utilization varies from 47% (with 171 PGs) to 90% (with 204 PGs)
The strange thing is the PG sizes within each pool:
- cache pool: minimum PG size is 39.6 GB, maximum is 40.9 GB
- data pool: minimum PG size is 18.6 GB, maximum is 31.3 GB
The cache pool currently holds 46 TB of data and the data pool 50 TB.
The number of PGs per pool was determined with PGCalc, assuming 25% of the data in the cache pool and 75% in the data pool.
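As a sanity check, here is the back-of-the-envelope arithmetic I would expect to hold (a minimal Python sketch using only the figures quoted above; the unit assumption is noted in the comments):

# Back-of-the-envelope average PG size per pool, using the figures above.
# Plain arithmetic only; adjust if your tools report decimal rather than
# binary units.

GB_PER_TB = 1024  # assuming binary units (TiB -> GiB)

pools = {
    # name: (stored data in TB, pg_num)
    "cache (size=3)": (46, 1024),
    "data (EC 4+2)":  (50, 2048),
}

for name, (stored_tb, pg_num) in pools.items():
    avg_gb = stored_tb * GB_PER_TB / pg_num
    print(f"{name}: ~{avg_gb:.1f} GB per PG on average")

The data pool average (~25 GB) lands right in the middle of the 18.6-31.3 GB range I observe, so it is really the spread within that pool that puzzles me, not the average.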
I cannot explain the difference between the two pools; it seems to me that data should be more evenly distributed in the data pool, especially given its current utilization.
Am I just unlucky, with a lot of the big PGs ending up on the same OSDs, or is there something funny going on?
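To see whether the imbalance simply tracks which PGs (and how many) landed on each OSD, I was thinking of tallying the mapping from `ceph pg dump`. A minimal sketch of that follows; the JSON field names (pg_stats, up, stat_sum.num_bytes) are assumptions on my part and may differ between versions:

# Tally per-OSD PG count and logical bytes from `ceph pg dump`.
# Run first:  ceph pg dump --format json > pgdump.json
# Field names (pg_stats, up, stat_sum.num_bytes) are what I would expect,
# but may differ between Ceph versions -- adjust as needed.
# Note: for the EC (4+2) pool each up-set member only stores ~num_bytes/4
# of a PG, so the totals below overstate that pool's share per OSD.

import json
from collections import defaultdict

with open("pgdump.json") as f:
    dump = json.load(f)

bytes_per_osd = defaultdict(int)
pgs_per_osd = defaultdict(int)

for pg in dump["pg_stats"]:
    num_bytes = pg["stat_sum"]["num_bytes"]
    for osd in pg["up"]:              # every OSD in the PG's up set
        bytes_per_osd[osd] += num_bytes
        pgs_per_osd[osd] += 1

for osd in sorted(bytes_per_osd):
    gib = bytes_per_osd[osd] / 2**30
    print(f"osd.{osd}: {pgs_per_osd[osd]} PGs, ~{gib:.0f} GiB mapped")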
How can I try to correct the data distribution? I have not played with OSD weights (either manually or via reweight-by-utilization) yet.
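If manual reweighting turns out to be the way to go, this is roughly the sort of adjustment I had in mind, i.e. a hand-rolled version of what I understand reweight-by-utilization does. It is only a sketch: the OSD ids and utilisation values below are placeholders (the real ones would come from `ceph osd df`), and I would of course review the generated commands before running any of them:

# Sketch of a manual version of reweight-by-utilization: for OSDs well above
# the average utilisation, print a `ceph osd reweight` command that lowers
# their override weight (a value between 0.0 and 1.0) proportionally.

utilisation = {
    # placeholder OSD ids; the 0.47 / 0.90 extremes are the figures above
    10: 0.47,
    23: 0.66,
    57: 0.90,
}

avg = sum(utilisation.values()) / len(utilisation)
threshold = 1.20   # only touch OSDs more than 20% above the average

for osd, used in sorted(utilisation.items()):
    if used > avg * threshold:
        # scale the weight down so the OSD would sit nearer the average,
        # but never drop below 0.5 in a single step
        new_weight = max(0.5, round(avg / used, 2))
        print(f"ceph osd reweight {osd} {new_weight}")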