Hello Jan,

A low-hanging fruit first: there are some built-in Grafana dashboards shipped with Ceph that show OSD latencies; the apply and commit latencies in particular are useful for tracking write latency. If you have Prometheus metrics exported, there are also a few good community dashboards for a disk performance overview:
https://grafana.com/grafana/dashboards/?search=ceph+osd
(a rough sketch of the kind of queries these dashboards build on is below)

One piece of advice that worked amazingly well for our HDDs is to disable the drive write cache, mentioned here by Dan (Ceph Days NYC 2023: Ceph at CERN: A Ten-Year Retrospective):
https://youtu.be/2I_U2p-trwI?t=889
We observed write latencies drop almost in half, especially during heavier backfilling operations; the commands we used are sketched below. Some metrics from a single HDD disk:
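In case it helps, this is roughly how we toggled the cache. Device names are placeholders, and behaviour differs between SATA and SAS drives and between controllers, so double-check against your setup:

  # SATA drives: disable the volatile write cache
  # (hdparm -W with no value prints the current state; the setting may need
  #  to be reapplied after a reboot, e.g. via a udev rule)
  hdparm -W 0 /dev/sdX
  hdparm -W /dev/sdX

  # SAS/SCSI drives: clear the WCE (write cache enable) bit and verify
  sdparm --clear WCE /dev/sdX
  sdparm --get WCE /dev/sdX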
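For the dashboard side, assuming the mgr prometheus module is enabled and Prometheus is scraping it, the queries behind those panels look roughly like the ones below. The Prometheus host is a placeholder and the metric names can vary between Ceph releases, so treat this purely as a sketch:

  # per-OSD commit/apply latency (ms) -- the same numbers "ceph osd perf" shows
  curl -sG 'http://prometheus.example:9090/api/v1/query' \
       --data-urlencode 'query=ceph_osd_commit_latency_ms'

  # rough average write-op latency per OSD over the last 5 minutes
  # (assumes the OSD perf counters are exported as *_sum/*_count pairs;
  #  check the metric HELP text for the unit)
  curl -sG 'http://prometheus.example:9090/api/v1/query' \
       --data-urlencode 'query=rate(ceph_osd_op_w_latency_sum[5m]) / rate(ceph_osd_op_w_latency_count[5m])'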
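Also, since you are chasing per-commit latency (the PostgreSQL case), a small sync-write fio job with queue depth 1 gets much closer to that than 1M sequential writes. Something along these lines, with paths and runtime only as placeholders; look at the clat percentiles (e.g. p99) in the output rather than the bandwidth:

  fio --name=synclat --filename=/var/tmp/fio-synclat.test --size=1G \
      --ioengine=libaio --direct=1 --rw=randwrite --bs=4k \
      --iodepth=1 --numjobs=1 --fsync=1 --time_based --runtime=30s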
Best,
Laimis J.
laimis.juzeliunas@xxxxxxxxxx

> On 3 Jan 2025, at 11:37, Jan Kasprzak <kas@xxxxxxxxxx> wrote:
>
> Hello, ceph users,
>
> TL;DR: how can I look into ceph cluster write latency issues?
>
> Details: we have an HDD-based cluster (with NVMe for metadata), about 20 hosts,
> 2 OSDs per host, mostly used as RBD storage for QEMU/KVM virtual machines.
> From time to time our users complain about write latencies inside their VMs.
>
> I would like to be able to see when the cluster is overloaded or when
> the write latency is bad.
>
> What I have tried so far:
>
> 1) fio inside the KVM virtual machine:
>
> fio --ioengine=libaio --direct=1 --rw=write --numjobs=1 --bs=1M --iodepth=16 --size=5G --name=/var/tmp/fio-test
> [...]
> write: IOPS=63, BW=63.3MiB/s (66.4MB/s)(5120MiB/80863msec); 0 zone resets
>
> I am usually getting about 60 to 150 IOPS for 1 MB writes.
>
> 2) PostgreSQL from the KVM virtual machine, running many tiny INSERTs
> as separate transactions for about 10 seconds. This is where I clearly see
> latency spikes:
>
> Wed Dec 18 09:20:21 PM CET 2024   406.062 txn/s
> Wed Dec 18 09:25:21 PM CET 2024   318.974 txn/s
> Wed Dec 18 09:30:21 PM CET 2024   285.591 txn/s
> Wed Dec 18 09:35:21 PM CET 2024   191.804 txn/s
> Wed Dec 18 09:40:22 PM CET 2024   246.679 txn/s
> Wed Dec 18 09:45:22 PM CET 2024   201.005 txn/s
> Wed Dec 18 09:50:22 PM CET 2024   153.206 txn/s
> Wed Dec 18 09:55:22 PM CET 2024   124.546 txn/s
> Wed Dec 18 10:00:23 PM CET 2024    33.094 txn/s
> Wed Dec 18 10:05:23 PM CET 2024    82.659 txn/s
> Wed Dec 18 10:10:23 PM CET 2024   292.544 txn/s
> Wed Dec 18 10:15:24 PM CET 2024   453.366 txn/s
>
> The drawback of both the fio and PostgreSQL benchmarks is that I am
> unnecessarily loading the cluster with additional work just to measure
> latency. And I am not covering the whole cluster, only the OSDs on which
> that VM happens to have its data.
>
> 3) ceph osd perf
>
> I don't see any single obviously overloaded OSD here, but the latencies vary
> nevertheless. Here are statistics (commit and apply latency columns, in ms)
> computed across all OSDs from the "ceph osd perf" output:
>
> Fri Jan  3 10:12:41 AM CET 2025
>   average   9   9
>   median    5   5
>   3rd-q    13  13
>   max      70  70
> Fri Jan  3 10:13:42 AM CET 2025
>   average   5   5
>   median    3   3
>   3rd-q    10  10
>   max      31  31
> Fri Jan  3 10:14:42 AM CET 2025
>   average   3   3
>   median    2   2
>   3rd-q     4   4
>   max      19  19
> Fri Jan  3 10:15:42 AM CET 2025
>   average   5   5
>   median    1   1
>   3rd-q     3   3
>   max      63  63
>
> However, I am not sure what exactly these numbers mean
> - what timespan do they cover? I would like to have something like
> "in the last 5 minutes, 99 % of all writes committed under XXX ms".
> Can Ceph tell me that?
>
> What else, apart from buying faster hardware, can I try in order
> to improve the write latency for QEMU/KVM-based VMs with RBD images?
>
> Thanks for any hints.
>
> 	-Yenya
>
> --
> | Jan "Yenya" Kasprzak <kas at {fi.muni.cz - work | yenya.net - private}> |
> | https://www.fi.muni.cz/~kas/                        GPG: 4096R/A45477D5 |
> We all agree on the necessity of compromise. We just can't agree on
> when it's necessary to compromise.                        --Larry Wall

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx