On Fri, Aug 12, 2016 at 07:47:54AM +0100, roeland mertens wrote:
> Hi Brad,
>
> thank you for that. Unfortunately our immediate concern is the blocked ops
> rather than the broken pg (we know why it's broken).

OK. If you look at the following file, it shows not only the declaration of
wait_for_blocked_object (highlighted) but also all of its callers.

https://github.com/ceph/ceph/blob/master/src/osd/ReplicatedPG.cc#L500

Several of the callers relate to snapshots, but I'd suggest turning debug
logging for the OSDs right up; that may give us more information.

# ceph tell osd.* injectargs '--debug_osd 20 --debug_ms 5'

Note: the above will turn up debugging for all OSDs. You may want to focus on
only a few, so adjust accordingly (there's an example of narrowing it to a
single OSD further down, after the quoted op dump).

> I don't think that's specifically crushmap related nor related to the
> broken pg as the osds involved in the blocked ops aren't the ones that were
> hosting the broken pg.
>
> On 12 August 2016 at 04:12, Brad Hubbard <bhubbard@xxxxxxxxxx> wrote:
>
> > On Thu, Aug 11, 2016 at 11:33:29PM +0100, Roeland Mertens wrote:
> > > Hi,
> > >
> > > I was hoping someone on this list may be able to help?
> > >
> > > We're running a 35 node 10.2.1 cluster with 595 OSDs. For the last 12
> > > hours we've been plagued with blocked requests which completely kill
> > > the performance of the cluster.
> > >
> > > # ceph health detail
> > > HEALTH_ERR 1 pgs are stuck inactive for more than 300 seconds; 1 pgs down;
> > > 1 pgs peering; 1 pgs stuck inactive; 100 requests are blocked > 32 sec;
> > > 1 osds have slow requests; noout,nodeep-scrub,sortbitwise flag(s) set
> > > pg 63.1a18 is stuck inactive for 135133.509820, current state
> > > down+remapped+peering, last acting
> > > [2147483647,2147483647,2147483647,2147483647,2147483647,2147483647,235,148,290,300,147,157,370]
> >
> > That value (2147483647) is defined in src/crush/crush.h like so:
> >
> > #define CRUSH_ITEM_NONE   0x7fffffff  /* no result */
> >
> > So this could be due to a bad crush rule, or maybe choose_total_tries
> > needs to be higher?
> >
> > $ ceph osd crush rule ls
> >
> > Then, for each rule listed by the above command:
> >
> > $ ceph osd crush rule dump [rule_name]
> >
> > I'd then dump out the crushmap and test it, showing any bad mappings, with
> > the commands listed here:
> >
> > http://docs.ceph.com/docs/master/rados/troubleshooting/troubleshooting-pg/#crush-gives-up-too-soon
> >
> > I'd also check the pg numbers for your pool(s) are appropriate, as not
> > enough pgs could also be a contributing factor IIRC.
> >
> > That should hopefully give some insight.
> >
> > --
> > HTH,
> > Brad
> >
> > > pg 63.1a18 is down+remapped+peering, acting
> > > [2147483647,2147483647,2147483647,2147483647,2147483647,2147483647,235,148,290,300,147,157,370]
> > > 100 ops are blocked > 2097.15 sec on osd.4
> > > 1 osds have slow requests
> > > noout,nodeep-scrub,sortbitwise flag(s) set
> > >
> > > the one pg down is due to us running into an odd EC issue which I mailed
> > > the list about earlier; it's the 100 blocked ops that are puzzling us.
> > > If we out the osd in question, they just shift to another osd (on a
> > > different host!). We even tried rebooting the node it's on but to
> > > little avail.
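Re the CRUSH_ITEM_NONE entries in the acting set quoted above: you can check
whether the rule is simply failing to find enough OSDs by testing the map
offline, roughly along these lines (the rule id "1" and "--num-rep 13" below
are only placeholders, take the real values from "ceph osd crush rule dump"
for that EC pool):

$ ceph osd getcrushmap -o /tmp/crush.map
$ crushtool -i /tmp/crush.map --test --show-bad-mappings --rule 1 --num-rep 13 --min-x 1 --max-x 1024

If that prints bad mappings, raising choose_total_tries as described in the
troubleshooting link quoted above is one way to address it, e.g.

$ crushtool -d /tmp/crush.map -o /tmp/crush.txt
  (edit the "tunable choose_total_tries 50" line up to, say, 100)
$ crushtool -c /tmp/crush.txt -o /tmp/crush.new
$ ceph osd setcrushmap -i /tmp/crush.new

If it prints nothing, the CRUSH side is probably fine and the blocked ops are
a separate issue, as you suspect.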
> > > We get a ton of log messages like this:
> > >
> > > 2016-08-11 23:32:10.041174 7fc668d9f700  0 log_channel(cluster) log [WRN] :
> > > 100 slow requests, 5 included below; oldest blocked for > 139.313915 secs
> > > 2016-08-11 23:32:10.041184 7fc668d9f700  0 log_channel(cluster) log [WRN] :
> > > slow request 139.267004 seconds old, received at 2016-08-11 23:29:50.774091:
> > > osd_op(client.9192464.0:485640 66.b96c3a18
> > > default.4282484.42_442fac8195c63a2e19c3c4bb91e8800e [getxattrs,stat,read
> > > 0~524288] snapc 0=[] RETRY=36 ack+retry+read+known_if_redirected e50109)
> > > currently waiting for blocked object
> > > 2016-08-11 23:32:10.041189 7fc668d9f700  0 log_channel(cluster) log [WRN] :
> > > slow request 139.244839 seconds old, received at 2016-08-11 23:29:50.796256:
> > > osd_op(client.9192464.0:596033 66.942a5a18
> > > default.4282484.30__shadow_.sLkZ_rUX6cvi0ifFasw1UipEIuFPzYB_6 [write
> > > 1048576~524288] snapc 0=[] RETRY=36
> > > ack+ondisk+retry+write+known_if_redirected e50109) currently waiting for
> > > blocked object
> > >
> > > A dump of the blocked ops tells us very little; is there anyone who can
> > > shed some light on this? Or at least give us a hint on how we can fix this?
> > >
> > > # ceph daemon osd.4 dump_blocked_ops
> > > ....
> > > {
> > >     "description": "osd_op(client.9192464.0:596030 66.942a5a18 default.4282484.30__shadow_.sLkZ_rUX6cvi0ifFasw1UipEIuFPzYB_6 [writefull 0~0] snapc 0=[] RETRY=32 ack+ondisk+retry+write+known_if_redirected e50092)",
> > >     "initiated_at": "2016-08-11 22:58:09.721027",
> > >     "age": 1515.105186,
> > >     "duration": 1515.113255,
> > >     "type_data": [
> > >         "reached pg",
> > >         {
> > >             "client": "client.9192464",
> > >             "tid": 596030
> > >         },
> > >         [
> > >             {
> > >                 "time": "2016-08-11 22:58:09.721027",
> > >                 "event": "initiated"
> > >             },
> > >             {
> > >                 "time": "2016-08-11 22:58:09.721066",
> > >                 "event": "waiting_for_map not empty"
> > >             },
> > >             {
> > >                 "time": "2016-08-11 22:58:09.813574",
> > >                 "event": "reached_pg"
> > >             },
> > >             {
> > >                 "time": "2016-08-11 22:58:09.813581",
> > >                 "event": "waiting for peered"
> > >             },
> > >             {
> > >                 "time": "2016-08-11 22:58:09.852796",
> > >                 "event": "reached_pg"
> > >             },
> > >             {
> > >                 "time": "2016-08-11 22:58:09.852804",
> > >                 "event": "waiting for peered"
> > >             },
> > >             {
> > >                 "time": "2016-08-11 22:58:10.876636",
> > >                 "event": "reached_pg"
> > >             },
> > >             {
> > >                 "time": "2016-08-11 22:58:10.876640",
> > >                 "event": "waiting for peered"
> > >             },
> > >             {
> > >                 "time": "2016-08-11 22:58:10.902760",
> > >                 "event": "reached_pg"
> > >             }
> > >         ]
> > >     ]
> > > }
> > > ...
> > >
> > > Kind regards,
> > >
> > > Roeland
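Since the blocked ops above are all sitting on osd.4, you could also limit
the extra logging to just that daemon and drop it back down afterwards,
something along these lines (assuming you are still on the stock defaults of
debug_osd 1/5 and debug_ms 0/5):

# ceph tell osd.4 injectargs '--debug_osd 20 --debug_ms 5'
  ... wait for a few more of the slow request warnings, then grab
  /var/log/ceph/ceph-osd.4.log from the node hosting osd.4 ...
# ceph tell osd.4 injectargs '--debug_osd 1/5 --debug_ms 0/5'

At level 20 the osd log should show which object the op is waiting on and
what is holding it, which is what we need to get past "currently waiting for
blocked object".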
> --
> Roeland Mertens
> Systems Engineer - Genomics PLC

--
Cheers,
Brad

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com