Re: backfill_toofull, but OSDs not full

Craig Lewis <clewis@xxxxxxxxxxxxxxxxxx> · Fri, 9 Jan 2015 10:39:58 -0800

What was the osd_backfill_full_ratio?  That's the config setting that controls backfill_toofull.  By default, it's 85%.  The mon_osd_*_ratio settings affect what ceph status reports.
I've noticed that it takes a while for backfilling to restart after changing osd_backfill_full_ratio.  Backfilling usually restarts for me in 10-15 minutes.  Some PGs will stay in that state until the cluster is nearly done recovering.

I've only seen backfill_toofull happen after the OSD exceeds the ratio (so it's reactive, not proactive).  For me it usually happens when I'm rebalancing a nearfull cluster, and an OSD backfills itself toofull.
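
To illustrate what I mean by reactive, here is a toy model in Python.  It is not the actual OSD code, only a sketch under the assumption that an OSD is treated as toofull once its own used fraction is above osd_backfill_full_ratio.  The function name and the utilization numbers are made up for the example.

```python
# Toy model of a reactive "toofull" check. NOT the actual Ceph OSD code.
# Assumption: an OSD counts as toofull for backfill once its used
# fraction is strictly above osd_backfill_full_ratio.

DEFAULT_BACKFILL_FULL_RATIO = 0.85  # the default mentioned above


def toofull_osds(used_fraction_by_osd,
                 backfill_full_ratio=DEFAULT_BACKFILL_FULL_RATIO):
    """Return the sorted OSD ids whose used fraction exceeds the ratio."""
    return sorted(
        osd_id
        for osd_id, used in used_fraction_by_osd.items()
        if used > backfill_full_ratio
    )


# Hypothetical utilization figures, for illustration only.
usage = {0: 0.25, 1: 0.72, 2: 0.87}
print(toofull_osds(usage))        # checked against the 0.85 default
print(toofull_osds(usage, 0.92))  # checked against a raised ratio
```

In this model only the OSD at 87% is flagged with the default ratio, and nothing is flagged once the ratio is raised to 0.92.  An OSD at 72% would never be flagged, which is why the state you describe looks odd to me.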

On Mon, Jan 5, 2015 at 11:32 AM, c3 <ceph-users@xxxxxxxxxx> wrote:
Hi,

I am wondering how a PG gets marked backfill_toofull.

I reweighted several OSDs using ceph osd crush reweight. As expected, PGs began moving around (backfilling).

Some PGs got marked +backfilling (~10), some +wait_backfill (~100).

But some are marked +backfill_toofull. My OSDs are between 25% and 72% full.

Looking at ceph pg dump, I found the backfill_toofull PGs and verified that the OSDs involved are less than 72% full.

Do backfill reservations include a size? Are these OSDs projected to be toofull once the current backfilling completes? Some of the backfill_toofull and backfilling PGs point to the same OSDs.

I did adjust the full ratios, but that did not change the backfill_toofull status.

ceph tell mon.\* injectargs '--mon_osd_full_ratio 0.95'

ceph tell osd.\* injectargs '--osd_backfill_full_ratio 0.92'

_______________________________________________

ceph-users mailing list

ceph-users@xxxxxxxxxxxxxx

http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
