Hi All,

We have a production cluster that is suffering from intermittent blocked requests (25 requests are blocked > 32 sec). The blocked requests occur frequently and affect all OSDs.

In the OSD daemon logs I can see related messages:

16-11-11 18:25:29.917518 7fd28b989700 0 log_channel(cluster) log [WRN] : slow request 30.429723 seconds old, received at 2016-11-11 18:24:59.487570: osd_op(client.2406272.1:336025615 rbd_data.66e952ae8944a.0000000000350167 [set-alloc-hint object_size 4194304 write_size 4194304,write 0~524288] 0.8d3c9da5 snapc 248=[248,216] ondisk+write e201514) currently waiting for subops from 210,499,821

So I guess the issue is related to the replication process when writing new data to the cluster. The secondary OSDs named in the OSD daemon logs are never the same from one message to the next.

As a result we are experiencing very high write latency on the Ceph client side (it can be up to 1 hour!).

We have checked network health as well as disk health, but we were not able to find any issue.

I wanted to know whether this issue has already been observed, or whether you have ideas on how to investigate or work around it.

Many thanks...
Thomas

The cluster is composed of 37 DN, 851 OSDs and 5 MONs.
The Ceph clients are accessing the cluster with RBD.
The cluster is running Hammer, version 0.94.5.

    cluster 1a26e029-3734-4b0e-b86e-ca2778d0c990
     health HEALTH_WARN
            25 requests are blocked > 32 sec
            1 near full osd(s)
            noout flag(s) set
     monmap e3: 5 mons at {NVMBD1CGK190D00=10.137.81.13:6789/0,nvmbd1cgy050d00=10.137.78.226:6789/0,nvmbd1cgy070d00=10.137.78.232:6789/0,nvmbd1cgy090d00=10.137.78.228:6789/0,nvmbd1cgy130d00=10.137.78.218:6789/0}
            election epoch 664, quorum 0,1,2,3,4 nvmbd1cgy130d00,nvmbd1cgy050d00,nvmbd1cgy090d00,nvmbd1cgy070d00,NVMBD1CGK190D00
     osdmap e205632: 851 osds: 850 up, 850 in
            flags noout
      pgmap v25919096: 10240 pgs, 1 pools, 197 TB data, 50664 kobjects
            597 TB used, 233 TB / 831 TB avail
               10208 active+clean
                  32 active+clean+scrubbing+deep
  client io 97822 kB/s rd, 205 MB/s wr, 2402 op/s

Thank you,
Thomas Danan
Director of Product Development
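One way to check whether some OSDs show up more often than others among the pending replicas is to tally the ids listed after "waiting for subops from" across the OSD logs. Below is a minimal Python sketch of that idea. It assumes each slow request message sits on a single log line in the format quoted above; the names SUBOPS_RE and tally_subop_osds are made up for illustration and are not part of any Ceph tool.

```python
import re
from collections import Counter

# Assumption: the message format matches the slow request line quoted in
# this post, with the whole message on one line.
SUBOPS_RE = re.compile(
    r"slow request (\d+(?:\.\d+)?) seconds old"
    r".*?waiting for subops from ([\d,]+)"
)


def tally_subop_osds(lines):
    """Count how often each OSD id is named as a pending sub-op target."""
    counts = Counter()
    for line in lines:
        match = SUBOPS_RE.search(line)
        if match is None:
            continue  # not a "waiting for subops" slow request line
        for osd_id in match.group(2).split(","):
            if osd_id:
                counts[int(osd_id)] += 1
    return counts


# Tail of the log line quoted in this post, used as a single sample input.
sample = (
    "log [WRN] : slow request 30.429723 seconds old, received at "
    "2016-11-11 18:24:59.487570: osd_op(client.2406272.1:336025615 "
    "rbd_data.66e952ae8944a.0000000000350167 [set-alloc-hint object_size "
    "4194304 write_size 4194304,write 0~524288] 0.8d3c9da5 snapc "
    "248=[248,216] ondisk+write e201514) currently waiting for subops "
    "from 210,499,821"
)

counts = tally_subop_osds([sample])
# Show the OSD ids that appear most often as pending sub-op targets.
print(counts.most_common(10))
```

Feeding it the lines of all OSD logs (for example by iterating over the open log files) instead of the single sample would show whether the pending OSDs are really spread evenly or whether a few ids, or the OSDs of one host, dominate the count.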