Getting placement groups to be placed evenly continues to be a major challenge for us, bordering on impossible. When we first reported trouble with this, the ceph cluster had 12 OSDs (each an Intel DC S3700 400GB) spread across three nodes. Since then, it has grown to 8 nodes with 38 OSDs. The average utilization is 80%. With all weights set to 1, utilization varies from 53% to 96%. Immediately after "ceph osd reweight-by-utilization 105" it varies from 61% to 90%. Essentially, once utilization goes over 75%, managing the OSD weights to keep all of them under 90% becomes a full-time job.

This is on 0.80.9 with optimal tunables (including the chooseleaf_vary_r=1 and straw_calc_version=1 settings). The pool has 2048 placement groups and size=2.

What, if anything, can we do about this? The goals are twofold, and in priority order:

1) Guarantee that the cluster can survive the loss of a node without dying because one "unlucky" OSD overfills.
2) Utilize the available space as efficiently as possible.

We are targeting 85% utilization, but currently things get ugly pretty quickly over 75%.

Thanks for any advice!
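To make the "full-time job" part concrete, here is a minimal sketch (illustrative only, not our actual tooling) of the kind of reweight pass described above: given per-OSD utilization fractions and current override reweights, it prints "ceph osd reweight" commands that nudge each OSD toward a target utilization. The OSD ids, utilization figures, and the 0.85 target are placeholder values, and collecting the utilization numbers (e.g. df on each OSD's data partition) is left out. Note this adjusts the temporary override reweight (a value in [0, 1]), not the CRUSH weight.

#!/usr/bin/env python
# Sketch: emit "ceph osd reweight" commands that scale each OSD's
# override reweight toward a target utilization. All input values
# below are illustrative placeholders.

TARGET = 0.85          # desired utilization fraction (illustrative)

current_reweight = {   # osd id -> current override reweight ("ceph osd tree")
    0: 1.00,
    1: 1.00,
    2: 0.90,
}

utilization = {        # osd id -> observed utilization fraction
    0: 0.96,
    1: 0.61,
    2: 0.88,
}

for osd, util in sorted(utilization.items()):
    # Scale the existing reweight by target/actual, clamped to 1.0,
    # so overfull OSDs shed PGs and underfull ones attract them.
    new = min(1.0, current_reweight[osd] * TARGET / util)
    if abs(new - current_reweight[osd]) > 0.01:
        print("ceph osd reweight %d %.2f" % (osd, new))

"ceph osd reweight-by-utilization" does roughly this in one shot, but as noted above the spread creeps back out afterwards, so we end up re-running something like this by hand.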