Hello Ceph Users,

I am finding that the write latency across my Ceph clusters isn't great, and I wanted to see what other people are getting for op_w_latency. Generally I am getting 70-110ms latency. I am measuring it with:

    ceph --admin-daemon /var/run/ceph/ceph-osd.102.asok perf dump | grep -A3 '\"op_w_latency' | grep 'avgtime'

RAM, CPU and network don't seem to be the bottleneck. The drives are behind a Dell H810p RAID card with a 1GB write-back cache and battery. I have tried LSI JBOD cards and haven't found them faster (as you would expect, given the write cache on the H810p).
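For comparing across all the OSDs on a host, a quick loop over the admin sockets saves grepping each one by hand. A rough sketch (it assumes jq is installed, that the sockets follow the /var/run/ceph/ceph-osd.*.asok naming above, and that the counter sits under the "osd" section of perf dump, as it does on recent releases):

    # Print the op_w_latency average for every OSD with an admin socket
    # on this host. Assumes jq is installed; the .osd.op_w_latency.avgtime
    # path matches the perf dump layout on Luminous and later.
    for sock in /var/run/ceph/ceph-osd.*.asok; do
        id=$(basename "$sock" .asok)
        lat=$(ceph --admin-daemon "$sock" perf dump | jq -r '.osd.op_w_latency.avgtime')
        printf '%s op_w_latency avgtime: %ss\n' "$id" "$lat"
    done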
The disks, via iostat -xyz 1, show 10-30% utilisation, with service + write latency generally around 3-4ms. Queue depth is normally less than one. RocksDB write latency is around 0.6ms, reads 1-2ms. Usage is an RBD backend for CloudStack.

Dumping the historic ops seems to show where the latency is (ceph --admin-daemon /var/run/ceph/ceph-osd.102.asok dump_historic_ops_by_duration | less):

                {
                    "time": "2019-04-01 22:24:38.432000",
                    "event": "queued_for_pg"
                },
                {
                    "time": "2019-04-01 22:24:38.438691",
                    "event": "reached_pg"
                },
                {
                    "time": "2019-04-01 22:24:38.438740",
                    "event": "started"
                },
                {
                    "time": "2019-04-01 22:24:38.727820",
                    "event": "sub_op_started"
                },
                {
                    "time": "2019-04-01 22:24:38.728448",
                    "event": "sub_op_committed"
                },
                {
                    "time": "2019-04-01 22:24:39.129175",
                    "event": "commit_sent"
                },
                {
                    "time": "2019-04-01 22:24:39.129231",
                    "event": "done"
                }
            ]
        }
    }

This was one of the very slow writes, and I am wondering if I have a few ops that are taking a long time while most are fine. What else can I do to figure out where the issue is?
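You can eyeball the gaps between consecutive event timestamps, but across many ops it is easier to script. A rough sketch that prints the time spent in each transition for the longest op (it assumes jq is available, that the events sit under .ops[0].type_data.events as in Luminous-era output, and that an op does not span midnight):

    # Per-event latency breakdown for the first op in the by-duration dump
    # (which should be the slowest one). Assumes jq is installed and that
    # the events live under .ops[0].type_data.events, as in Luminous-era
    # output; adjust the path if your release nests them differently.
    ceph --admin-daemon /var/run/ceph/ceph-osd.102.asok dump_historic_ops_by_duration |
    jq -r '.ops[0].type_data.events[] | "\(.time) \(.event)"' |
    awk '{
        split($2, t, "[:.]")                           # HH, MM, SS, microseconds
        now = t[1]*3600 + t[2]*60 + t[3] + t[4]/1e6    # seconds since midnight
        if (NR > 1) printf "%-16s +%8.3f ms\n", $3, (now - prev) * 1000
        else        printf "%-16s   (start)\n", $3
        prev = now
    }'

Run against the op quoted above, that would show the two big jumps: started -> sub_op_started (~289 ms) and sub_op_committed -> commit_sent (~401 ms), i.e. nearly all of the time is spent around the sub-op/commit path rather than in the PG queue or on the local data disk.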