Hello Ceph Users,

I am finding that the write latency across my Ceph clusters isn't great, and I wanted to see what other people are getting for op_w_latency. Generally I am getting 70-110ms latency. I am measuring it with:

    ceph --admin-daemon /var/run/ceph/ceph-osd.102.asok perf dump | grep -A3 '\"op_w_latency' | grep 'avgtime'

RAM, CPU and network don't seem to be the bottleneck. The drives are behind a Dell H810p RAID card with a 1GB write-back cache and battery. I have tried LSI JBOD cards and haven't found them faster (as you would expect, given the write cache on the H810p).
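For comparing across all the OSDs on a host, a quick loop over the admin sockets saves grepping each one by hand. A rough sketch (it assumes jq is installed, that the sockets follow the /var/run/ceph/ceph-osd.*.asok naming above, and that the counter sits under the "osd" section of perf dump, as it does on recent releases):

    # Print the op_w_latency average for every OSD with an admin socket
    # on this host. Assumes jq is installed; the .osd.op_w_latency.avgtime
    # path matches the perf dump layout on Luminous and later.
    for sock in /var/run/ceph/ceph-osd.*.asok; do
        id=$(basename "$sock" .asok)
        lat=$(ceph --admin-daemon "$sock" perf dump | jq -r '.osd.op_w_latency.avgtime')
        printf '%s op_w_latency avgtime: %ss\n' "$id" "$lat"
    done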
The disks, via iostat -xyz 1, show 10-30% utilisation, with service + write latency generally around 3-4ms. Queue depth is normally less than one. RocksDB write latency is around 0.6ms, reads 1-2ms. Usage is an RBD backend for CloudStack.

Dumping the historic ops seems to show where the latency is (ceph --admin-daemon /var/run/ceph/ceph-osd.102.asok dump_historic_ops_by_duration | less):

                {
                    "time": "2019-04-01 22:24:38.432000",
                    "event": "queued_for_pg"
                },
                {
                    "time": "2019-04-01 22:24:38.438691",
                    "event": "reached_pg"
                },
                {
                    "time": "2019-04-01 22:24:38.438740",
                    "event": "started"
                },
                {
                    "time": "2019-04-01 22:24:38.727820",
                    "event": "sub_op_started"
                },
                {
                    "time": "2019-04-01 22:24:38.728448",
                    "event": "sub_op_committed"
                },
                {
                    "time": "2019-04-01 22:24:39.129175",
                    "event": "commit_sent"
                },
                {
                    "time": "2019-04-01 22:24:39.129231",
                    "event": "done"
                }
            ]
        }
    }

This was one of the very slow writes, and I am wondering if I have a few ops that are taking a long time while most are fine. What else can I do to figure out where the issue is?
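You can eyeball the gaps between consecutive event timestamps, but across many ops it is easier to script. A rough sketch that prints the time spent in each transition for the longest op (it assumes jq is available, that the events sit under .ops[0].type_data.events as in Luminous-era output, and that an op does not span midnight):

    # Per-event latency breakdown for the first op in the by-duration dump
    # (which should be the slowest one). Assumes jq is installed and that
    # the events live under .ops[0].type_data.events, as in Luminous-era
    # output; adjust the path if your release nests them differently.
    ceph --admin-daemon /var/run/ceph/ceph-osd.102.asok dump_historic_ops_by_duration |
    jq -r '.ops[0].type_data.events[] | "\(.time) \(.event)"' |
    awk '{
        split($2, t, "[:.]")                           # HH, MM, SS, microseconds
        now = t[1]*3600 + t[2]*60 + t[3] + t[4]/1e6    # seconds since midnight
        if (NR > 1) printf "%-16s +%8.3f ms\n", $3, (now - prev) * 1000
        else        printf "%-16s   (start)\n", $3
        prev = now
    }'

Run against the op quoted above, that would show the two big jumps: started -> sub_op_started (~289 ms) and sub_op_committed -> commit_sent (~401 ms), i.e. nearly all of the time is spent around the sub-op/commit path rather than in the PG queue or on the local data disk.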