On Thu, May 4, 2017 at 8:40 AM Osama Hasebou <osama.hasebou@xxxxxx> wrote:
Hi everyone,

We keep running into stalled I/Os: throughput drops almost to zero whenever a node suddenly goes down, or while a large amount of rebalancing is in progress. Even after rebalancing completes, we still see stalled I/O for 2-10 minutes.

Has anyone seen this behaviour before and found a way to fix it? We are seeing this on both Ceph Hammer and Jewel.
Please check the setting "mon osd down out subtree limit". Setting it to "host" prevents OSDs from being automatically marked out when an entire host fails.
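In ceph.conf form, that might look like the sketch below (the placement and value are illustrative):

```ini
# ceph.conf (illustrative sketch)
[mon]
# Do not automatically mark OSDs "out" when a failure takes down a
# whole subtree of at least this size (here: an entire host), so a
# single host outage does not trigger a full rebalance on its own.
mon osd down out subtree limit = host
```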
Also worth tuning:

- osd-recovery-max-active (my setting is 5)
- osd-recovery-threads (my setting is 3)
- osd-max-backfills (my setting is 5)
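Expressed as a ceph.conf fragment, those settings might look like this (a sketch using the values above; lower values throttle recovery and backfill traffic further, at the cost of longer rebalancing):

```ini
# ceph.conf (illustrative sketch, values as quoted above)
[osd]
# Cap concurrent recovery operations per OSD.
osd recovery max active = 5
# Threads servicing recovery work per OSD.
osd recovery threads = 3
# Cap concurrent backfill operations to/from a single OSD.
osd max backfills = 5
```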
Hth,
Alex
Thanks.
Regards,
Ossi

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
--
Alex Gorbachev
Storcium