Actually, there were hundreds that were too full. We manually set the OSD weights to 0.5, and it seems to be recovering.
Thanks for the tips on crush reweight. I will look into it.
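For reference, the manual override amounts to something like this per OSD (the id 42 is illustrative; reweight takes a 0.0-1.0 override value):

    # Lower the reweight override on a too-full OSD so PGs migrate off it
    ceph osd reweight 42 0.5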
—Jiten

How many OSDs are nearfull?
I've seen Ceph want two too-full OSDs to swap PGs with each other. In that case, I temporarily raised mon_osd_nearfull_ratio and osd_backfill_full_ratio a bit, then set them back to normal once the scheduling deadlock cleared.
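A sketch of how that can be done at runtime with injectargs (the 0.90/0.92 values are illustrative; 0.85 is the usual default for both, and option availability depends on your Ceph release):

    # Temporarily raise the thresholds so the stuck backfills can proceed
    ceph tell mon.* injectargs '--mon-osd-nearfull-ratio 0.90'
    ceph tell osd.* injectargs '--osd-backfill-full-ratio 0.92'

    # ... wait for the deadlocked PGs to finish backfilling ...

    # Put the thresholds back to their defaults
    ceph tell mon.* injectargs '--mon-osd-nearfull-ratio 0.85'
    ceph tell osd.* injectargs '--osd-backfill-full-ratio 0.85'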
Keep in mind that ceph osd reweight is temporary. If you mark an OSD OUT and then IN, the weight will be reset to 1.0. If you need something persistent, use ceph osd crush reweight osd.NUM <crush_weight>. Look at ceph osd tree to see the current weights.
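Side by side (osd.42 and the values are illustrative):

    # Temporary: a 0.0-1.0 override, reset to 1.0 when the OSD goes OUT and comes back IN
    ceph osd reweight 42 0.5

    # Persistent: changes the CRUSH weight itself (conventionally the disk size in TiB)
    ceph osd crush reweight osd.42 1.75

    # Shows both the CRUSH WEIGHT and the REWEIGHT columns
    ceph osd tree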
I also recommend stepping toward your goal. Changing either weight can cause a lot of unrelated migrations, and the crush weight seems to cause more of them than the osd weight does. I step the osd weight by 0.125 and the crush weight by 0.05; a rough sketch of that loop follows.
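A minimal bash sketch of the stepping approach, assuming a hypothetical osd.42 being walked down to 0.5, and waiting for recovery to settle between steps (the health grep is a crude heuristic, not an exact status check):

    #!/bin/bash
    set -e
    OSD=42       # illustrative OSD id
    TARGET=0.5   # final reweight value
    w=1.0
    while (( $(echo "$w > $TARGET" | bc -l) )); do
        # Step the reweight override down by 0.125
        w=$(printf '%.3f' "$(echo "$w - 0.125" | bc -l)")
        ceph osd reweight "$OSD" "$w"
        # Wait until backfill/recovery settles before the next step
        while ceph health | grep -qE 'backfill|recover'; do
            sleep 60
        done
    done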
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com