As I'm watching the mgr balancer module optimizing the layout on the lab
cluster, I'm seeing a lot of cases where the recovery scheduling in luminous
is broken. For example,

    pgs:     63895/153086292 objects degraded (0.042%)
             1439665/153086292 objects misplaced (0.940%)
             12166 active+clean
             198   active+remapped+backfill_wait
             97    active+remapped+backfilling
             10    active+undersized+degraded+remapped+backfill_wait
             6     active+recovery_wait+degraded
             2     active+recovery_wait+degraded+remapped
             1     active+undersized+degraded+remapped+backfilling

It is "wrong" that any PGs would be in recovery_wait (a high-priority
log-based recovery activity) when there is a ton of backfill going on.

I've fixed this in master with a few rounds of recovery preemption PRs, and
shaken out a few other issues in the process, but can't backport it to
luminous without burning a feature bit.

I just took an inventory and we have 12 bits available, and I just marked 8
more deprecated that we can remove after the O release. So... I think it's
worth burning one on this. Any objections?

And I guess also, is there anything else we wish we could backport to
luminous but would need to burn a feature bit to do it?

sage