Getting placement groups to be placed evenly continues to be a major challenge for us, bordering on impossible. When we first reported trouble with this, the ceph cluster had 12 OSDs (each an Intel DC S3700 400GB) spread across three nodes. Since then, it has grown to 8 nodes with 38 OSDs. The average utilization is 80%. With all weights set to 1, utilization varies from 53% to 96%. Immediately after "ceph osd reweight-by-utilization 105" it varies from 61% to 90%. Essentially, once utilization goes over 75%, managing the OSD weights to keep all of them under 90% becomes a full-time job.

This is on 0.80.9 with optimal tunables (including the chooseleaf_vary_r=1 and straw_calc_version=1 settings). The pool has 2048 placement groups and size=2.

What, if anything, can we do about this? The goals are twofold, and in priority order:

1) Guarantee that the cluster can survive the loss of a node without dying because one "unlucky" OSD overfills.
2) Utilize the available space as efficiently as possible.

We are targeting 85% utilization, but currently things get ugly pretty quickly over 75%.

Thanks for any advice!
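To make the "full-time job" part concrete, here is a minimal sketch (illustrative only, not our actual tooling) of the kind of reweight pass described above: given per-OSD utilization fractions and current override reweights, it prints "ceph osd reweight" commands that nudge each OSD toward a target utilization. The OSD ids, utilization figures, and the 0.85 target are placeholder values, and collecting the utilization numbers (e.g. df on each OSD's data partition) is left out. Note this adjusts the temporary override reweight (a value in [0, 1]), not the CRUSH weight.

#!/usr/bin/env python
# Sketch: emit "ceph osd reweight" commands that scale each OSD's
# override reweight toward a target utilization. All input values
# below are illustrative placeholders.

TARGET = 0.85          # desired utilization fraction (illustrative)

current_reweight = {   # osd id -> current override reweight ("ceph osd tree")
    0: 1.00,
    1: 1.00,
    2: 0.90,
}

utilization = {        # osd id -> observed utilization fraction
    0: 0.96,
    1: 0.61,
    2: 0.88,
}

for osd, util in sorted(utilization.items()):
    # Scale the existing reweight by target/actual, clamped to 1.0,
    # so overfull OSDs shed PGs and underfull ones attract them.
    new = min(1.0, current_reweight[osd] * TARGET / util)
    if abs(new - current_reweight[osd]) > 0.01:
        print("ceph osd reweight %d %.2f" % (osd, new))

"ceph osd reweight-by-utilization" does roughly this in one shot, but as noted above the spread creeps back out afterwards, so we end up re-running something like this by hand.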