Hi,

I just upgraded from Infernalis to Jewel and see an approximately 10x latency increase.

Quick facts:
- 3x replicated pool
- 4 nodes, each with 2x Intel Xeon E5-2690 v3 @ 2.60 GHz, 128 GB RAM, and 6x 1.6 TB Intel S3610 SSDs
- LSI 3008 controller with up-to-date firmware and the upstream driver; up-to-date firmware on the SSDs as well
- 40GbE (Mellanox, with up-to-date drivers and firmware)
- CentOS 7.2

The physical layer checks out, verified with iperf3 for the network and e.g. fio across all the SSDs. I haven't done much Linux tuning yet, but irqbalance does a pretty good job of pairing both the NIC and the HBA with their respective CPUs.

I'm in performance-hunting mode, and today I took the next logical step of upgrading from Infernalis to Jewel.

The tester is a remote KVM/Qemu/libvirt guest (OpenStack), a CentOS 7 image running fio. The test scenario is 4K random write, libaio, direct I/O, QD=1, runtime=900 s, test file size 40 GiB (a sketch of an equivalent fio invocation is appended after the links below).

The results went from [1] to [2]. In [1], the guest saw 98.25% of the I/Os complete within at most 250 µs (~4000 IOPS, since at QD=1 IOPS is roughly 1/latency). In [2], 98.95% of the I/Os complete at ~4 ms (actually ~300 IOPS). Between [1] and [2] (simple plots of fio's end-to-end latency metrics), the entire cluster, including the compute nodes, went from Infernalis to Jewel 10.2.2.

What's going on here? I haven't tuned the Ceph OSDs at all yet, either in their config or via the Linux kernel; the upgrade to Jewel came first. I haven't changed any OSD configs between [1] and [2] myself (only minimally before [1], with zero effort spent on performance tuning), other than updating to the Jewel tunables. But the difference is very drastic, wouldn't you say?

Best,
Martin

[1] http://martin.millnert.se/ceph/pngs/guest-ceph-fio-bench/test08/ceph-fio-bench_lat.1.png
[2] http://martin.millnert.se/ceph/pngs/guest-ceph-fio-bench/test10/ceph-fio-bench_lat.1.png
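
P.S. For anyone who wants to reproduce the test: the scenario above corresponds roughly to the fio invocation below. This is a sketch, not the exact job I ran; the filename is a placeholder and time_based is my assumption, while the remaining parameters mirror the ones listed above.

    # 4K random writes, direct I/O via libaio, QD=1, 900 s runtime
    # (time_based assumed), against a 40 GiB test file.
    # The filename is a placeholder; point it at a file inside the guest.
    fio --name=randwrite-qd1 \
        --ioengine=libaio --direct=1 \
        --rw=randwrite --bs=4k --iodepth=1 \
        --size=40g --runtime=900 --time_based \
        --filename=/mnt/test/fio-testfile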