Hi guys,

I found a strange issue in a Ceph cluster. The cluster backs an OpenStack
deployment with 70 VMs. Under high load (more than 3000 ops, 90% writes)
there are many slow requests (osd_op_complaint_time = 3) in the Ceph cache
tier.

After investigating, I found that the interval between the
waiting_for_osdmap event and the reached_pg event of an op is sometimes very
long (roughly 10 to 100 seconds). This means that some ops sit in the op_wq
queue for a long time before an OSD op thread picks them up.

What could cause this? Has anyone encountered this situation before? I don't
think the osdmap, the PG lock, or an object lock could block an OSD op thread
for that long (10 to 100 seconds).

hardware (11 nodes in total):
  - 6 x nodes (compute & storage):
    - 3 x nodes: HUAWEI RH2288H V2 server, 2 x E5-2658, 256GB
      - 1 x HUAWEI SSD disk 800GB, used for ceph osd
      - 12 x 1.2TB SAS disk (each raid0 has 4 SAS disks), used for ceph
    - 3 x nodes: HP DL380 Gen9 server, 2 x E5-2640, 128GB
      - 2 x OCZ SSD disk 800GB, used for ceph
      - 12 x 1.2TB SAS disk (each raid0 has 4 SAS disks), used for ceph
  - 5 x nodes (compute)

os:
  - HUAWEI server: RHEL 6.4, 2.6.32-358.123.2.openstack
  - HP DL380 Gen9 server: RHEL 6.5, 2.6.32-431.el6.x86_64

ceph:
  - ceph 0.80.10
  - 9 ssd osd for cache tier
  - 18 raid0 osd for storage tier
  - replication size = 2
  - pg num = 1024
  - osd op threads = 20
  - filestore op threads = 20

dump_historic_ops (osd.4, ssd):

        { "description": "osd_op(client.983894.0:8044023 rbd_data.468d085e74d0.00000000000031b9 [] 4.5605c210 ack+ondisk+write e11409)",
          "received_at": "2015-12-08 13:31:49.645295",
          "age": "100.252129",
          "duration": "25.419458",
          "type_data": [
                "commit sent; apply or cleanup",
                { "client": "client.983894",
                  "tid": 8044023},
                [
                    { "time": "2015-12-08 13:31:49.645616",
                      "event": "waiting_for_osdmap"},
                    { "time": "2015-12-08 13:32:01.529474",
                      "event": "reached_pg"},
                    { "time": "2015-12-08 13:32:01.529557",
                      "event": "started"},
                    { "time": "2015-12-08 13:32:01.529575",
                      "event": "started"},
                    { "time": "2015-12-08 13:32:01.529642",
                      "event": "waiting for subops from 25"},
                    { "time": "2015-12-08 13:32:01.530340",
                      "event": "commit_queued_for_journal_write"},
                    { "time": "2015-12-08 13:32:01.531270",
                      "event": "write_thread_in_journal_buffer"},
                    { "time": "2015-12-08 13:32:01.533131",
                      "event": "journaled_completion_queued"},
                    { "time": "2015-12-08 13:32:01.533412",
                      "event": "sub_op_commit_rec"},
                    { "time": "2015-12-08 13:32:05.813736",
                      "event": "op_commit"},
                    { "time": "2015-12-08 13:32:05.814035",
                      "event": "commit_sent"},
                    { "time": "2015-12-08 13:32:15.064722",
                      "event": "op_applied"},
                    { "time": "2015-12-08 13:32:15.064753",
                      "event": "done"}]]}]}

--
Rongze Zhu
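
For anyone who wants to measure the same gap on their own OSDs, here is a rough sketch that computes the time between consecutive events of one op from dump_historic_ops output. It assumes the JSON layout shown in the dump above (ceph 0.80.x, with the event list as the third element of "type_data"); other releases may lay this out differently. The function names event_gaps and queue_wait are made up for illustration, and the sample data is the op from the dump above.

```python
import json
from datetime import datetime

# Timestamp format as it appears in the dump above.
TIME_FORMAT = "%Y-%m-%d %H:%M:%S.%f"

# The op from the dump above, reduced to the fields this sketch needs.
SAMPLE_OP = json.loads("""
{ "received_at": "2015-12-08 13:31:49.645295",
  "duration": "25.419458",
  "type_data": [
    "commit sent; apply or cleanup",
    { "client": "client.983894", "tid": 8044023},
    [
      { "time": "2015-12-08 13:31:49.645616", "event": "waiting_for_osdmap"},
      { "time": "2015-12-08 13:32:01.529474", "event": "reached_pg"},
      { "time": "2015-12-08 13:32:01.529557", "event": "started"},
      { "time": "2015-12-08 13:32:01.529575", "event": "started"},
      { "time": "2015-12-08 13:32:01.529642", "event": "waiting for subops from 25"},
      { "time": "2015-12-08 13:32:01.530340", "event": "commit_queued_for_journal_write"},
      { "time": "2015-12-08 13:32:01.531270", "event": "write_thread_in_journal_buffer"},
      { "time": "2015-12-08 13:32:01.533131", "event": "journaled_completion_queued"},
      { "time": "2015-12-08 13:32:01.533412", "event": "sub_op_commit_rec"},
      { "time": "2015-12-08 13:32:05.813736", "event": "op_commit"},
      { "time": "2015-12-08 13:32:05.814035", "event": "commit_sent"},
      { "time": "2015-12-08 13:32:15.064722", "event": "op_applied"},
      { "time": "2015-12-08 13:32:15.064753", "event": "done"}
    ]
  ]
}
""")


def parse_time(text):
    """Parse a timestamp string from the dump into a datetime."""
    return datetime.strptime(text, TIME_FORMAT)


def event_gaps(op):
    """Return (previous_event, next_event, seconds) for consecutive events.

    Assumption: the event list is op["type_data"][2], as in the dump above.
    """
    events = op["type_data"][2]
    gaps = []
    for prev, cur in zip(events, events[1:]):
        delta = parse_time(cur["time"]) - parse_time(prev["time"])
        gaps.append((prev["event"], cur["event"], delta.total_seconds()))
    return gaps


def queue_wait(op):
    """Seconds from waiting_for_osdmap to reached_pg, or None if either is missing."""
    first_seen = {}
    for entry in op["type_data"][2]:
        # Keep only the first occurrence of each event name.
        first_seen.setdefault(entry["event"], parse_time(entry["time"]))
    if "waiting_for_osdmap" not in first_seen or "reached_pg" not in first_seen:
        return None
    return (first_seen["reached_pg"] - first_seen["waiting_for_osdmap"]).total_seconds()


if __name__ == "__main__":
    # Print the gaps, largest first, so the slow stage stands out.
    for prev_event, next_event, seconds in sorted(
            event_gaps(SAMPLE_OP), key=lambda gap: gap[2], reverse=True):
        print("%10.6f s  %s -> %s" % (seconds, prev_event, next_event))
    print("queue wait: %s s" % queue_wait(SAMPLE_OP))
```

For the op above, the largest gap is the one between waiting_for_osdmap and reached_pg, at about 11.9 seconds of the 25.4 second total.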