Hi,

Sometimes we see the same issue on our 10.2.9 cluster (24 nodes with 60 OSDs each). I think there is some race condition or something like that which results in this state. The blocked requests start exactly at the time the PG begins to scrub.

You can try the following; the OSD will automatically recover and the blocked requests will disappear:

ceph osd down 31

In my opinion this is a bug, but I have not investigated it so far. Maybe some developer can say something about this issue.

Regards,
Manuel

On Tue, 22 Aug 2017 16:20:14 +0300, Ramazan Terzi <ramazanterzi@xxxxxxxxx> wrote:

> Hello,
>
> I have a Ceph cluster with the specifications below:
> 3 x monitor node
> 6 x storage node (6 disks per storage node, 6 TB SATA disks, all disks
> have SSD journals). Separate public and private networks. All NICs
> are 10 Gbit/s.
> osd pool default size = 3
> osd pool default min size = 2
>
> Ceph version is Jewel 10.2.6.
>
> My cluster is active and a lot of virtual machines are running on it
> (Linux and Windows VMs, database clusters, web servers, etc.).
>
> During normal use, the cluster slowly went into a state of blocked
> requests, and the blocked requests keep incrementing periodically. All OSDs seem
> healthy; benchmarks, iowait checks, and network tests all succeed.
>
> Yesterday, 08:00:
> $ ceph health detail
> HEALTH_WARN 3 requests are blocked > 32 sec; 3 osds have slow requests
> 1 ops are blocked > 134218 sec on osd.31
> 1 ops are blocked > 134218 sec on osd.3
> 1 ops are blocked > 8388.61 sec on osd.29
> 3 osds have slow requests
>
> Today, 16:05:
> $ ceph health detail
> HEALTH_WARN 32 requests are blocked > 32 sec; 3 osds have slow requests
> 1 ops are blocked > 134218 sec on osd.31
> 1 ops are blocked > 134218 sec on osd.3
> 16 ops are blocked > 134218 sec on osd.29
> 11 ops are blocked > 67108.9 sec on osd.29
> 2 ops are blocked > 16777.2 sec on osd.29
> 1 ops are blocked > 8388.61 sec on osd.29
> 3 osds have slow requests
>
> $ ceph pg dump | grep scrub
> dumped all in format plain
> pg_stat  objects  mip  degr  misp  unf  bytes        log   disklog  state
> 20.1e    25183    0    0     0     0    98332537930  3066  3066     active+clean+scrubbing
>
> state_stamp                 v              reported       up         up_primary  acting     acting_primary
> 2017-08-21 04:55:13.354379  6930'23908781  6930:20905696  [29,31,3]  29          [29,31,3]  29
>
> last_scrub     scrub_stamp                 last_deep_scrub  deep_scrub_stamp
> 6712'22950171  2017-08-20 04:46:59.208792  6712'22950171    2017-08-20 04:46:59.208792
>
> The active scrub does not finish (it has been running for about 24 hours), and I have
> not restarted any OSD in the meantime. I'm thinking of setting the noscrub, nodeep-scrub,
> norebalance, nobackfill, and norecover flags and then restarting OSDs 3, 29, and 31.
> Will this solve my problem? Or does anyone have a suggestion about this problem?
>
> Thanks,
> Ramazan
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

--
Manuel Lausch
Systemadministrator, Cloud Services

1&1 Mail & Media Development & Technology GmbH | Brauerstraße 48 | 76135 Karlsruhe | Germany
Phone: +49 721 91374-1847
E-Mail: manuel.lausch@xxxxxxxx | Web: www.1und1.de
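
For anyone wanting to try the workaround Manuel describes, here is a minimal sketch of the command sequence, assuming osd.31 is one of the OSDs reporting blocked ops (as in the health output above); the noout flag is an extra precaution not mentioned in the thread, and the OSD number would need adjusting for another cluster:

$ ceph osd set noout      # optional: keep the briefly down-marked OSD from being marked out
$ ceph osd down 31        # mark osd.31 down in the OSD map; the running daemon notices and re-peers on its own
$ ceph -w                 # watch the OSD come back up and peering complete
$ ceph health detail      # the blocked/slow request warnings should clear after re-peering
$ ceph osd unset noout    # remove the precautionary flag again

This only follows the behaviour Manuel reports (the OSD recovers automatically after being marked down); it does not address the suspected scrub bug itself.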