Hi Reed,
This has already been changed and is included in Kraken and later
(https://github.com/ceph/ceph/pull/12389).
It does exactly what you want: backfills for inactive PGs (those with
fewer copies than min_size) get higher priority, even higher than
recovery, to minimize the time there are inactive PGs in the cluster.
Hopefully it will be backported to jewel:
http://tracker.ceph.com/issues/18350.
I'll try to prepare the backport today; there's a chance it will be
part of 10.2.6.
When we had this kind of issue on a somewhat larger cluster (400+ OSDs),
the only thing we could do was play with osd_max_backfills and the
recovery settings: check which OSDs hold the inactive PGs, raise the
backfill/recovery limits on those, and throttle backfill/recovery on
all the others.
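That workaround can be sketched roughly as below. This is an assumption-laden example, not a recipe: osd.3 and osd.7 are hypothetical stand-ins for whichever OSDs actually hold your inactive PGs, the numeric limits are just illustrative, and it has to run against a live cluster with admin keyring access.

```shell
# Find the PGs stuck inactive and note which OSDs they map to (acting set)
ceph pg dump_stuck inactive

# Raise backfill/recovery limits on the OSDs holding the inactive PGs
# (osd.3 and osd.7 are placeholders for your actual OSD ids)
ceph tell osd.3 injectargs '--osd_max_backfills 4 --osd_recovery_max_active 8'
ceph tell osd.7 injectargs '--osd_max_backfills 4 --osd_recovery_max_active 8'

# Throttle every other OSD so the critical PGs get the bandwidth
for id in $(ceph osd ls | grep -vw -e 3 -e 7); do
    ceph tell osd."$id" injectargs '--osd_max_backfills 0 --osd_recovery_max_active 0'
done

# Once the inactive PGs are active+clean again, restore the jewel defaults
ceph tell osd.\* injectargs '--osd_max_backfills 1 --osd_recovery_max_active 3'
```

Note that injectargs changes are not persistent; restarting an OSD reverts it to the values in ceph.conf, which is convenient here since the throttling is meant to be temporary.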
Regards,
Bartek
On 02/01/2017 07:25 PM, Reed Dier wrote:
Have a smallish cluster that has been expanding, with almost a 50% increase in the number of OSDs (16->24).
This has caused some issues with data integrity and cluster performance as we have increased the PG count and added OSDs.
8x nodes with 3x drives, connected over 2x10G.
My problem is that I have PGs that have become grossly undersized (size=3, min_size=2), in some cases down to just 1 copy, which created a deadlock of io backed up behind the PGs without enough copies.
It has been backfilling and recovering at a steady pace, but it seems that all backfills are weighted equally, and the more serious PGs could be at the front of the queue or at the very end, with no apparent rhyme or reason.
This has been exacerbated by a failing, but not failed, OSD, which I have marked out but left up, in an attempt to let it move its data off gracefully and not take on new io.
I guess my question would be: "is there a way to get the most important/critical recovery/backfill operations completed ahead of the less important/critical ones?" I.e., tackle the 1-copy PGs that are blocking io ahead of the less-used PGs that have 2 copies and are backfilling their 3rd.
Thanks,
Reed
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com