rbd latency

Hi there,

We have a production Ceph cluster with 12 OSDs spread over 6 hosts running version 0.72.2.

From time to time, we're seeing some nasty multi-second latencies (typically 1-3 seconds, sometimes as high as 5 seconds) inside QEMU VMs, for both read and write loads.

The VMs are still responsive - we installed the relevant QEMU patches a long time back for async rbd I/O. At that time we were seeing multi-second VM stalls.

I think all we've managed to do, however, is mask the real underlying problem. The VM OS no longer stalls, but an individual database I/O can still sit and wait far too long.
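
To try to catch these waits from inside a guest, we've been playing with a small probe script along the lines of the sketch below (the test file path, block size and threshold are just placeholders for illustration):

#!/usr/bin/env python
# Rough sketch: issue one small O_DIRECT+O_SYNC write per second against a
# throwaway test file inside the VM and log any write that takes longer than
# a threshold, so we can timestamp the spikes.
import mmap
import os
import time

TESTFILE = "/var/tmp/latency-probe"   # placeholder path on an rbd-backed fs
BLOCK = 4096                          # one 4 KiB block
THRESHOLD = 0.5                       # log anything slower than 500 ms
INTERVAL = 1.0                        # seconds between probe writes

# An anonymous mmap is zero-filled and page-aligned, which satisfies the
# buffer alignment requirements of O_DIRECT.
buf = mmap.mmap(-1, BLOCK)

fd = os.open(TESTFILE, os.O_WRONLY | os.O_CREAT | os.O_DIRECT | os.O_SYNC)
try:
    while True:
        start = time.time()
        os.lseek(fd, 0, os.SEEK_SET)
        os.write(fd, buf)
        elapsed = time.time() - start
        if elapsed > THRESHOLD:
            print("%s write took %.3fs"
                  % (time.strftime("%H:%M:%S"), elapsed))
        time.sleep(INTERVAL)
finally:
    os.close(fd)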

For a while it seemed there was a pattern to these spikes in latency, once every 30 minutes or so. We figured it might have something to do with scrubbing and changed the default OSD settings a bit:

[osd]
        osd op threads = 8
        osd op thread timeout = 60
        osd target transaction size = 50
        osd max backfills = 1
        osd recovery max active = 1
        osd journal size = 10000
        osd max scrubs = 1
        osd scrub load threshold = 0.3
        osd scrub min interval = 86400
        osd scrub max interval = 604800
        osd scrub stride = 65536
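
To check whether the spikes actually line up with scrubbing, we've also been thinking of logging active scrubs over time and comparing the timestamps against the spikes we see in the guests. Something like the sketch below (it just shells out to the ceph CLI; the assumption that "pg dump --format json" exposes a per-PG "state" string may not hold across releases):

#!/usr/bin/env python
# Sketch: periodically count PGs reporting a scrubbing state so the
# timestamps can be lined up against latency spikes seen in the VMs.
# Assumes "ceph pg dump --format json" returns a "pg_stats" list whose
# entries carry "pgid" and "state"; key names may differ between releases.
import json
import subprocess
import time

INTERVAL = 30  # seconds between samples

while True:
    out = subprocess.check_output(["ceph", "pg", "dump", "--format", "json"])
    pgs = json.loads(out.decode("utf-8")).get("pg_stats", [])
    scrubbing = [pg["pgid"] for pg in pgs if "scrub" in pg.get("state", "")]
    print("%s %d PGs scrubbing: %s"
          % (time.strftime("%Y-%m-%d %H:%M:%S"), len(scrubbing),
             ",".join(scrubbing) or "-"))
    time.sleep(INTERVAL)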

From what we can tell, though, the spikes now happen at all times of day, and we're not sure whether they're related to cluster load.

We'd obviously like to get a better handle on the problem and would appreciate suggestions on how we can better measure the phenomenon. For reference, these are the rados bench results we get (1 Gbps network, SATA disks, OSDs on XFS, run against the production cluster):

Total time run:         301.425932
Total writes made:      7064
Write size:             4194304
Bandwidth (MB/sec):     93.741

Stddev Bandwidth:       23.1038
Max bandwidth (MB/sec): 136
Min bandwidth (MB/sec): 0
Average Latency:        0.682643
Stddev Latency:         0.497225
Max latency:            3.75956
Min latency:            0.095493

Are those latencies in seconds? What is typical? What should we expect?

I would imagine that if we're seeing multi-second latencies, something is wrong? General throughput doesn't seem too bad, but as I said, it's the latency we're worried about here.
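
One thing we've been considering, to get per-OSD numbers over time rather than a single bench run, is polling the OSD admin sockets for op latency, roughly like the sketch below (the socket path and the counter layout, "osd" -> "op_latency" with "avgcount"/"sum", are assumptions on our part and may not match 0.72 exactly):

#!/usr/bin/env python
# Sketch: sample cumulative op latency from one OSD's admin socket via
# "perf dump" and print the average op latency over each interval.
# The socket path and counter names are assumptions; adjust per OSD/release.
import json
import subprocess
import time

SOCKET = "/var/run/ceph/ceph-osd.0.asok"  # one socket per OSD daemon
INTERVAL = 10                             # seconds between samples

def sample():
    out = subprocess.check_output(
        ["ceph", "--admin-daemon", SOCKET, "perf", "dump"])
    op = json.loads(out.decode("utf-8"))["osd"]["op_latency"]
    return op["avgcount"], op["sum"]

prev_count, prev_sum = sample()
while True:
    time.sleep(INTERVAL)
    count, total = sample()
    ops = count - prev_count
    if ops:
        print("%s avg op latency over last %ds: %.3fs (%d ops)"
              % (time.strftime("%H:%M:%S"), INTERVAL,
                 (total - prev_sum) / ops, ops))
    prev_count, prev_sum = count, total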

We've also tried turning off the hardware disk caches (thinking that queueing delays for commits requiring barriers might be the problem) and experimented with various I/O schedulers in both the host and the VM OSes. So far we've seen the best results with deadline on the hosts and noop in the VMs; disabling the on-disk caches doesn't seem to have made any difference.
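
For what it's worth, this is roughly how we've been flipping the scheduler on the hosts (a trivial sketch writing the block layer's sysfs knob; the sd* glob is a placeholder for our actual data disks, and it needs to run as root):

#!/usr/bin/env python
# Trivial sketch: switch the I/O scheduler for a set of block devices via
# sysfs. The device glob is a placeholder; we use "noop" inside the guests.
import glob

SCHEDULER = "deadline"

for path in glob.glob("/sys/block/sd*/queue/scheduler"):
    with open(path) as f:
        previous = f.read().strip()
    with open(path, "w") as f:
        f.write(SCHEDULER)
    print("%s: %s -> %s" % (path, previous, SCHEDULER))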

Any ideas?

Regards,
Edwin Peer
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



