On Thu, May 4, 2017 at 8:40 AM Osama Hasebou <osama.hasebou@xxxxxx> wrote:
Hi everyone,

We keep running into stalled I/Os: throughput drops almost to zero whenever a node suddenly goes down, or while a large amount of rebalancing is in progress. Even after rebalancing completes, we still see stalled I/O for 2-10 minutes.

Has anyone seen this behaviour before and found a way to fix it? We are seeing this on both Ceph Hammer and Jewel.
Please check the setting "mon osd down out subtree limit". Setting it to "host" prevents OSDs from being automatically marked out when an entire host fails.
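In ceph.conf form, that might look like the sketch below (the placement and value are illustrative):

```ini
# ceph.conf (illustrative sketch)
[mon]
# Do not automatically mark OSDs "out" when a failure takes down a
# whole subtree of at least this size (here: an entire host), so a
# single host outage does not trigger a full rebalance on its own.
mon osd down out subtree limit = host
```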
Also worth tuning:

- osd-recovery-max-active (my setting is 5)
- osd-recovery-threads (my setting is 3)
- osd-max-backfills (my setting is 5)
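Expressed as a ceph.conf fragment, those settings might look like this (a sketch using the values above; lower values throttle recovery and backfill traffic further, at the cost of longer rebalancing):

```ini
# ceph.conf (illustrative sketch, values as quoted above)
[osd]
# Cap concurrent recovery operations per OSD.
osd recovery max active = 5
# Threads servicing recovery work per OSD.
osd recovery threads = 3
# Cap concurrent backfill operations to/from a single OSD.
osd max backfills = 5
```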
Hth,
Alex
Thanks.
Regards,
Ossi

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
--
Alex Gorbachev
Storcium