Hi Brad,
thank you for that. Unfortunately our immediate concern is the blocked ops
rather than the broken pg (we know why it's broken).

On 12 August 2016 at 04:12, Brad Hubbard <bhubbard@xxxxxxxxxx> wrote:
On Thu, Aug 11, 2016 at 11:33:29PM +0100, Roeland Mertens wrote:
> Hi,
>
> I was hoping someone on this list may be able to help?
>
> We're running a 35 node 10.2.1 cluster with 595 OSDs. For the last 12 hours
> we've been plagued with blocked requests, which completely kill the
> performance of the cluster.
>
> # ceph health detail
> HEALTH_ERR 1 pgs are stuck inactive for more than 300 seconds; 1 pgs down; 1
> pgs peering; 1 pgs stuck inactive; 100 requests are blocked > 32 sec; 1 osds
> have slow requests; noout,nodeep-scrub,sortbitwise flag(s) set
> pg 63.1a18 is stuck inactive for 135133.509820, current state
> down+remapped+peering, last acting
> [2147483647,2147483647,2147483647,2147483647,2147483647,2147483647,235,148,290,300,147,157,370]

That value (2147483647) is defined in src/crush/crush.h like so;

#define CRUSH_ITEM_NONE 0x7fffffff /* no result */
So this could be due to a bad crush rule or maybe choose_total_tries needs to
be higher?
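If choose_total_tries does turn out to be the culprit, the rough workflow (a
sketch only; filenames are placeholders, and you'd want to test the new map
with crushtool before injecting it, see further down) is:

$ ceph osd getcrushmap -o crushmap.bin        # grab the current crushmap
$ crushtool -d crushmap.bin -o crushmap.txt   # decompile to editable text
# edit crushmap.txt and raise the "tunable choose_total_tries 50" line, e.g. to 100
$ crushtool -c crushmap.txt -o crushmap.new   # recompile
$ ceph osd setcrushmap -i crushmap.new        # inject only once it maps cleanly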
$ ceph osd crush rule ls
Then, for each rule listed by the above command, run:
$ ceph osd crush rule dump [rule_name]
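Something like this quick-and-dirty loop would cover them all (assuming the ls
output is the usual bracketed list that tr can strip down to bare names):

$ for rule in $(ceph osd crush rule ls | tr -d '[]",'); do ceph osd crush rule dump "$rule"; done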
I'd then dump out the crushmap and test it, showing any bad mappings, with the
commands listed here:
http://docs.ceph.com/docs/master/rados/troubleshooting/troubleshooting-pg/#crush-gives-up-too-soon
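Concretely, that boils down to something like the following (a sketch; the
--rule id comes from the rule dumps above and --num-rep should match the
pool's size, 13 here purely as an example to match your 13-entry acting set):

$ ceph osd getcrushmap -o crushmap.bin
$ crushtool -i crushmap.bin --test --show-bad-mappings --rule 1 --num-rep 13 --min-x 1 --max-x 1024

Any "bad mapping" lines mean CRUSH gave up before finding a full set of OSDs
for that input, which is exactly when 2147483647 shows up in the acting set.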
I'd also check that the pg numbers for your pool(s) are appropriate, as not
enough pgs could also be a contributing factor IIRC.
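For the pg count check, something along these lines shows what each pool is
set to (pool name is a placeholder):

$ ceph osd dump | grep ^pool             # pg_num / pgp_num, size, crush ruleset per pool
$ ceph osd pool get <poolname> pg_num    # or query a single pool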
That should hopefully give some insight.
--
HTH,
Brad
> pg 63.1a18 is down+remapped+peering, acting [2147483647,2147483647,2147483647,2147483647,2147483647,2147483647,235,148,290,300,147,157,370]
> 100 ops are blocked > 2097.15 sec on osd.4
> 1 osds have slow requests
> noout,nodeep-scrub,sortbitwise flag(s) set
>
> The one pg down is due to us running into an odd EC issue which I mailed the
> list about earlier; it's the 100 blocked ops that are puzzling us. If we out
> the osd in question, they just shift to another osd (on a different host!).
> We even tried rebooting the node it's on but to little avail.
>
> We get a ton of log messages like this:
>
> 2016-08-11 23:32:10.041174 7fc668d9f700 0 log_channel(cluster) log [WRN] :
> 100 slow requests, 5 included below; oldest blocked for > 139.313915 secs
> 2016-08-11 23:32:10.041184 7fc668d9f700 0 log_channel(cluster) log [WRN] :
> slow request 139.267004 seconds old, received at 2016-08-11 23:29:50.774091:
> osd_op(client.9192464.0:485640 66.b96c3a18
> default.4282484.42_442fac8195c63a2e19c3c4bb91e8800e [getxattrs,stat,read
> 0~524288] snapc 0=[] RETRY=36 ack+retry+read+known_if_redirected e50109)
> currently waiting for blocked object
> 2016-08-11 23:32:10.041189 7fc668d9f700 0 log_channel(cluster) log [WRN] :
> slow request 139.244839 seconds old, received at 2016-08-11 23:29:50.796256:
> osd_op(client.9192464.0:596033 66.942a5a18
> default.4282484.30__shadow_.sLkZ_rUX6cvi0ifFasw1UipEIuFPzYB_6 [write
> 1048576~524288] snapc 0=[] RETRY=36
> ack+ondisk+retry+write+known_if_redirected e50109) currently waiting for
> blocked object
>
> A dump of the blocked ops tells us very little. Is there anyone who can
> shed some light on this? Or at least give us a hint on how we can fix this?
>
> # ceph daemon osd.4 dump_blocked_ops
> ....
>
> {
> "description": "osd_op(client.9192464.0:596030 66.942a5a18
> default.4282484.30__shadow_.sLkZ_rUX6cvi0ifFasw1UipEIuFPzYB_6 [writefull
> 0~0] snapc 0=[] RETRY=32 ack+ondisk+retry+write+known_if_redirected
> e50092)",
> "initiated_at": "2016-08-11 22:58:09.721027",
> "age": 1515.105186,
> "duration": 1515.113255,
> "type_data": [
> "reached pg",
> {
> "client": "client.9192464",
> "tid": 596030
> },
> [
> {
> "time": "2016-08-11 22:58:09.721027",
> "event": "initiated"
> },
> {
> "time": "2016-08-11 22:58:09.721066",
> "event": "waiting_for_map not empty"
> },
> {
> "time": "2016-08-11 22:58:09.813574",
> "event": "reached_pg"
> },
> {
> "time": "2016-08-11 22:58:09.813581",
> "event": "waiting for peered"
> },
> {
> "time": "2016-08-11 22:58:09.852796",
> "event": "reached_pg"
> },
> {
> "time": "2016-08-11 22:58:09.852804",
> "event": "waiting for peered"
> },
> {
> "time": "2016-08-11 22:58:10.876636",
> "event": "reached_pg"
> },
> {
> "time": "2016-08-11 22:58:10.876640",
> "event": "waiting for peered"
> },
> {
> "time": "2016-08-11 22:58:10.902760",
> "event": "reached_pg"
> }
> ]
> ]
> }
> ...
>
>
> Kind regards,
>
>
> Roeland
>
>
--
Roeland Mertens
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com