On Thu, Aug 11, 2016 at 11:33:29PM +0100, Roeland Mertens wrote:
> Hi,
>
> I was hoping someone on this list may be able to help?
>
> We're running a 35-node 10.2.1 cluster with 595 OSDs. For the last 12 hours
> we've been plagued with blocked requests, which completely kills the
> performance of the cluster.
>
> # ceph health detail
> HEALTH_ERR 1 pgs are stuck inactive for more than 300 seconds; 1 pgs down; 1
> pgs peering; 1 pgs stuck inactive; 100 requests are blocked > 32 sec; 1 osds
> have slow requests; noout,nodeep-scrub,sortbitwise flag(s) set
> pg 63.1a18 is stuck inactive for 135133.509820, current state
> down+remapped+peering, last acting
> [2147483647,2147483647,2147483647,2147483647,2147483647,2147483647,235,148,290,300,147,157,370]

That value (2147483647) is defined in src/crush/crush.h like so:

#define CRUSH_ITEM_NONE   0x7fffffff  /* no result */

In other words, CRUSH couldn't fill those slots with any OSD. So this could be
due to a bad crush rule, or maybe choose_total_tries needs to be higher.

List the rules:

$ ceph osd crush rule ls

Then, for each rule listed by the above command, dump it and check it:

$ ceph osd crush rule dump [rule_name]

I'd then dump out the crushmap and test it, showing any bad mappings, with the
commands listed here:

http://docs.ceph.com/docs/master/rados/troubleshooting/troubleshooting-pg/#crush-gives-up-too-soon

I'd also check that the pg numbers for your pool(s) are appropriate, as not
enough pgs could also be a contributing factor IIRC. Rough sketches of both
checks follow below.
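As an untested sketch of that crushmap test: the file paths are just scratch
placeholders, --rule should be the rule_id from the dump above, and --num-rep
should match the pool's replica count / k+m, which looks like 13 going by the
13-slot acting set:

$ ceph osd getcrushmap -o /tmp/crushmap
$ crushtool -i /tmp/crushmap --test --show-bad-mappings \
      --rule 1 --num-rep 13 --min-x 1 --max-x 1000000

Any line it prints is an input that didn't map to a full set of 13 OSDs. If you
do see bad mappings, decompile the map, raise the tunable, re-test, and only
then inject it:

$ crushtool -d /tmp/crushmap -o /tmp/crushmap.txt
  (edit /tmp/crushmap.txt: raise the "tunable choose_total_tries" line,
   e.g. from 50 to 100)
$ crushtool -c /tmp/crushmap.txt -o /tmp/crushmap.new
$ crushtool -i /tmp/crushmap.new --test --show-bad-mappings \
      --rule 1 --num-rep 13 --min-x 1 --max-x 1000000
$ ceph osd setcrushmap -i /tmp/crushmap.new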
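For the pg count check, something simple like this (pool name is a
placeholder):

$ ceph osd pool get <poolname> pg_num
$ ceph osd pool get <poolname> pgp_num
$ ceph osd df

ceph osd df shows the per-OSD PG count in the PGS column; the usual rule of
thumb is on the order of 100 PGs per OSD across all pools, so with 595 OSDs
you can sanity-check whether your pools' pg_num values are in the right
ballpark.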
That should hopefully give some insight.

--
HTH,
Brad

> pg 63.1a18 is down+remapped+peering, acting
> [2147483647,2147483647,2147483647,2147483647,2147483647,2147483647,235,148,290,300,147,157,370]
> 100 ops are blocked > 2097.15 sec on osd.4
> 1 osds have slow requests
> noout,nodeep-scrub,sortbitwise flag(s) set
>
> The one pg down is due to us running into an odd EC issue which I mailed the
> list about earlier; it's the 100 blocked ops that are puzzling us. If we out
> the osd in question, they just shift to another osd (on a different host!).
> We even tried rebooting the node it's on, but to little avail.
>
> We get a ton of log messages like this:
>
> 2016-08-11 23:32:10.041174 7fc668d9f700  0 log_channel(cluster) log [WRN] :
> 100 slow requests, 5 included below; oldest blocked for > 139.313915 secs
> 2016-08-11 23:32:10.041184 7fc668d9f700  0 log_channel(cluster) log [WRN] :
> slow request 139.267004 seconds old, received at 2016-08-11 23:29:50.774091:
> osd_op(client.9192464.0:485640 66.b96c3a18
> default.4282484.42_442fac8195c63a2e19c3c4bb91e8800e [getxattrs,stat,read
> 0~524288] snapc 0=[] RETRY=36 ack+retry+read+known_if_redirected e50109)
> currently waiting for blocked object
> 2016-08-11 23:32:10.041189 7fc668d9f700  0 log_channel(cluster) log [WRN] :
> slow request 139.244839 seconds old, received at 2016-08-11 23:29:50.796256:
> osd_op(client.9192464.0:596033 66.942a5a18
> default.4282484.30__shadow_.sLkZ_rUX6cvi0ifFasw1UipEIuFPzYB_6 [write
> 1048576~524288] snapc 0=[] RETRY=36
> ack+ondisk+retry+write+known_if_redirected e50109) currently waiting for
> blocked object
>
> A dump of the blocked ops tells us very little; is there anyone who can
> shed some light on this? Or at least give us a hint on how we can fix this?
>
> # ceph daemon osd.4 dump_blocked_ops
> ....
>
> {
>     "description": "osd_op(client.9192464.0:596030 66.942a5a18
> default.4282484.30__shadow_.sLkZ_rUX6cvi0ifFasw1UipEIuFPzYB_6 [writefull
> 0~0] snapc 0=[] RETRY=32 ack+ondisk+retry+write+known_if_redirected
> e50092)",
>     "initiated_at": "2016-08-11 22:58:09.721027",
>     "age": 1515.105186,
>     "duration": 1515.113255,
>     "type_data": [
>         "reached pg",
>         {
>             "client": "client.9192464",
>             "tid": 596030
>         },
>         [
>             {
>                 "time": "2016-08-11 22:58:09.721027",
>                 "event": "initiated"
>             },
>             {
>                 "time": "2016-08-11 22:58:09.721066",
>                 "event": "waiting_for_map not empty"
>             },
>             {
>                 "time": "2016-08-11 22:58:09.813574",
>                 "event": "reached_pg"
>             },
>             {
>                 "time": "2016-08-11 22:58:09.813581",
>                 "event": "waiting for peered"
>             },
>             {
>                 "time": "2016-08-11 22:58:09.852796",
>                 "event": "reached_pg"
>             },
>             {
>                 "time": "2016-08-11 22:58:09.852804",
>                 "event": "waiting for peered"
>             },
>             {
>                 "time": "2016-08-11 22:58:10.876636",
>                 "event": "reached_pg"
>             },
>             {
>                 "time": "2016-08-11 22:58:10.876640",
>                 "event": "waiting for peered"
>             },
>             {
>                 "time": "2016-08-11 22:58:10.902760",
>                 "event": "reached_pg"
>             }
>         ]
>     ]
> }
> ...
>
> Kind regards,
>
> Roeland

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com