Re: Infernalis -> Jewel, 10x+ RBD latency increase

> -----Original Message-----
> From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of Martin Millnert
> Sent: 22 July 2016 00:33
> To: Ceph Users <ceph-users@xxxxxxxxxxxxxx>
> Subject:  Infernalis -> Jewel, 10x+ RBD latency increase
> 
> Hi,
> 
> I just upgraded from Infernalis to Jewel and see an approximate 10x latency increase.
> 
> Quick facts:
>  - 3x replicated pool
>  - 4x 2x-"E5-2690 v3 @ 2.60GHz", 128GB RAM, 6x 1.6 TB Intel S3610 SSDs,
>  - LSI3008 controller with up-to-date firmware and upstream driver, and up-to-date firmware on SSDs.
>  - 40GbE (Mellanox, with up-to-date drivers & firmware)
>  - CentOS 7.2
> 
> Physical checks out, both iperf3 for network and e.g. fio over all the SSDs. Not done much of Linux tuning yet; but irqbalanced does a
> pretty good job with pairing both NIC and HBA with their respective CPUs.
> 
> In performance hunting mode, and today took the next logical step of upgrading from Infernalis to Jewel.
> 
> Tester is remote KVM/Qemu/libvirt guest (openstack) CentOS 7 image with fio. The test scenario is 4K randomwrite, libaio, directIO,
> QD=1, runtime=900s, test-file-size=40GiB.
> 
> Went from a picture of [1] to [2]. In [1], the guest saw 98.25% of the I/O complete within maximum 250 µsec (~4000 IOPS). This, [2],
> sees 98.95% of the IO at ~4 msec (actually ~300 IOPs).

I suspect some form of caching was in play in the first test. 250 us is pretty much unachievable for direct I/O writes with Ceph. I've just built some new nodes with the sole goal of crushing (excuse the pun) write latency, and after extensive tuning I can't get it much below 600-700 us. The 4 ms figure sounds more plausible for an untuned cluster. Could any of the RBD or QEMU cache settings have changed between versions?
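
As a rough sanity check on the numbers in this thread: at queue depth 1 only one write is in flight at a time, so IOPS is bounded by the inverse of the per-I/O latency. The sketch below is only that arithmetic; the function name is made up for illustration, and 650 us is simply the midpoint of the 600-700 us range mentioned above.

```python
def qd_limited_iops(latency_s: float, queue_depth: int = 1) -> float:
    """Upper bound on IOPS when each in-flight I/O takes latency_s seconds.

    Hypothetical helper for illustration: with queue_depth outstanding I/Os,
    at most queue_depth completions can happen per latency_s interval.
    """
    if latency_s <= 0:
        raise ValueError("latency must be positive")
    return queue_depth / latency_s


if __name__ == "__main__":
    # Latencies quoted in the thread, in seconds.
    for label, latency in (("250 us", 250e-6), ("650 us", 650e-6), ("4 ms", 4e-3)):
        print(f"{label}: about {qd_limited_iops(latency):.0f} IOPS at QD=1")
```

The 250 us and ~4000 IOPS figures agree with each other, and ~4 ms lands in the same ballpark as the reported ~300 IOPS, so the fio numbers are self-consistent. The open question is where the extra latency comes from, not whether the measurements add up.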

> 
> Between [1] and [2] (simple plots of FIO's E2E-latency metrics), the entire cluster including compute nodes code went from Infernalis
> to 10.2.2
> 
> What's going on here?
> 
> I haven't tuned Ceph OSDs either in config or via Linux kernel at all yet; upgrade to Jewel came first. I haven't changed any OSD configs
> between [1] and [2] myself (only minimally before [1], 0 effort on performance tuning) , other than updated to Jewel tunables. But
> the difference is very drastic, wouldn't you say?
> 
> Best,
> Martin
> [1] http://martin.millnert.se/ceph/pngs/guest-ceph-fio-bench/test08/ceph-fio-bench_lat.1.png
> [2] http://martin.millnert.se/ceph/pngs/guest-ceph-fio-bench/test10/ceph-fio-bench_lat.1.png
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
