Hi

I'll check the possibility of testing EnhanceIO. I'll report back on this.

Thanks

Br,
T

-----Original Message-----
From: Mark Nelson [mailto:mnelson@xxxxxxxxxx]
Sent: 1 July 2015 21:51
To: Tuomas Juntunen; ceph-users@xxxxxxxxxxxxxx
Subject: Re: Very low 4k randread performance ~1000iops

On 07/01/2015 01:39 PM, Tuomas Juntunen wrote:
> Thanks Mark
>
> Are there any plans for a ZFS-like L2ARC in Ceph, or is cache tiering
> what should work like this in the future?
>
> I have tested cache tier + EC pool, and that created too much load on
> our servers, so it was not viable to be used.

We are doing a lot of work in this space right now. Hopefully we'll see
improvements in the coming releases.

>
> I was also wondering if EnhanceIO would be a good solution for getting
> more random iops. I've read some of Sébastien's writings.

Possibly! Try it and let us know. ;)

>
> Br,
> Tuomas
>
>
> -----Original Message-----
> From: Mark Nelson [mailto:mnelson@xxxxxxxxxx]
> Sent: 1 July 2015 20:29
> To: Tuomas Juntunen; ceph-users@xxxxxxxxxxxxxx
> Subject: Re: Very low 4k randread performance ~1000iops
>
> On 07/01/2015 12:13 PM, Tuomas Juntunen wrote:
>> Hi
>>
>> Yes, the OSDs are on spinning disks and we have 18 SSDs for the
>> journals, one SSD for two OSDs.
>>
>> The OSDs are:
>> Model Family: Seagate Barracuda 7200.14 (AF)
>> Device Model: ST2000DM001-1CH164
>>
>> As I've understood it, the journals are not used as a read cache at
>> all, just for writing. Would an SSD-based cache pool be a viable
>> solution here?
>
> Ok, so that makes more sense. The performance is still lower than
> expected, but maybe 3-4x rather than several orders of magnitude. My
> guess is that cache tiering in its current form probably won't help
> you much unless you have a workload that fits mostly into the cache.
> The promotion penalty is really high though, so we likely will have to
> promote much more slowly than we currently do.
>
> Mark
>
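For context, the cache tier + EC pool arrangement mentioned above is normally wired up along the following lines; this is only a sketch, and the pool names and limits are illustrative placeholders, not values from this cluster:

    # Hypothetical pools: "ecpool" (erasure-coded data) and "hotcache" (replicated SSD tier).
    ceph osd tier add ecpool hotcache
    ceph osd tier cache-mode hotcache writeback
    ceph osd tier set-overlay ecpool hotcache
    # Hit-set and size limits bound how aggressively the tier promotes and flushes.
    ceph osd pool set hotcache hit_set_type bloom
    ceph osd pool set hotcache hit_set_count 1
    ceph osd pool set hotcache hit_set_period 3600
    ceph osd pool set hotcache target_max_bytes 500000000000

As noted above, whether this helps depends mostly on how much of the working set actually fits in the cache tier.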
>>
>> Br, T
>>
>> -----Original Message-----
>> From: Mark Nelson [mailto:mnelson@xxxxxxxxxx]
>> Sent: 1 July 2015 13:58
>> To: Tuomas Juntunen; ceph-users@xxxxxxxxxxxxxx
>> Subject: Re: Very low 4k randread performance ~1000iops
>>
>> On 06/30/2015 10:42 PM, Tuomas Juntunen wrote:
>>> Hi
>>>
>>> For seq reads, here are the latencies:
>>> lat (usec) : 2=0.01%, 10=0.01%, 20=0.01%, 50=0.02%, 100=0.03%
>>> lat (usec) : 250=1.02%, 500=87.09%, 750=7.47%, 1000=1.50%
>>> lat (msec) : 2=0.76%, 4=1.72%, 10=0.19%, 20=0.19%
>>>
>>> Random reads:
>>> lat (usec) : 10=0.01%
>>> lat (msec) : 2=0.01%, 4=0.01%, 10=0.02%, 20=0.03%, 50=0.55%
>>> lat (msec) : 100=99.31%, 250=0.08%
>>>
>>> 100 msec seems a lot to me.
>>
>> It is, but what's more interesting imho is that it's so consistent.
>> You don't have some ops completing fast and other ones completing
>> slowly, holding everything up. It's like the OSDs are simply
>> overloaded with concurrent IOs and everything is waiting. Maybe I'm
>> confused, are your OSDs on SSDs? Are there spinning disks involved?
>> If so, what model(s)?
>>
>> You might want to use "collectl -sD -oT" on one of the OSD nodes
>> during the test and see what the IO to the disk looks like during
>> random reads, and especially what the svctime for the disks is like.
>>
>> Mark
>>
>>>
>>> Br,T
>>>
>>> -----Original Message-----
>>> From: Mark Nelson [mailto:mnelson@xxxxxxxxxx]
>>> Sent: 30 June 2015 22:01
>>> To: Tuomas Juntunen; ceph-users@xxxxxxxxxxxxxx
>>> Subject: Re: Very low 4k randread performance ~1000iops
>>>
>>> Seems reasonable. What does the latency distribution look like in
>>> your fio output file? It would be useful to know if it's universally
>>> slow or if some ops are taking much longer to complete than others.
>>>
>>> Mark
>>>
>>> On 06/30/2015 01:27 PM, Tuomas Juntunen wrote:
>>>> I created a file which has the following parameters:
>>>>
>>>> [random-read]
>>>> rw=randread
>>>> size=128m
>>>> directory=/root/asd
>>>> ioengine=libaio
>>>> bs=4k
>>>> #numjobs=8
>>>> iodepth=64
>>>>
>>>> Br,T
>>>>
>>>> -----Original Message-----
>>>> From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On
>>>> Behalf Of Mark Nelson
>>>> Sent: 30 June 2015 20:55
>>>> To: ceph-users@xxxxxxxxxxxxxx
>>>> Subject: Re: Very low 4k randread performance ~1000iops
>>>>
>>>> Hi Tuomas,
>>>>
>>>> Can you paste the command you ran to do the test?
>>>>
>>>> Thanks,
>>>> Mark
>>>>
>>>> On 06/30/2015 12:18 PM, Tuomas Juntunen wrote:
>>>>> Hi
>>>>>
>>>>> It's probably not hitting the disks, but that really doesn't
>>>>> matter. The point is we have very responsive VMs while writing,
>>>>> and that is what the users will see.
>>>>>
>>>>> The iops we get with sequential reads are good, but the random
>>>>> read is way too low.
>>>>>
>>>>> Is using SSDs as OSDs the only way to get it up, or is there some
>>>>> tunable which would enhance it? I would assume Linux caches reads
>>>>> in memory and serves them from there, but at least now we don't
>>>>> see it.
>>>>>
>>>>> Br,
>>>>>
>>>>> Tuomas
>>>>>
>>>>> From: Somnath Roy [mailto:Somnath.Roy@xxxxxxxxxxx]
>>>>> Sent: 30 June 2015 19:24
>>>>> To: Tuomas Juntunen; 'ceph-users'
>>>>> Subject: RE: Very low 4k randread performance ~1000iops
>>>>>
>>>>> Break it down, try fio-rbd to see what performance you are getting.
>>>>>
>>>>> But, I am really surprised you are getting over 100k iops for
>>>>> write; did you check whether it is hitting the disks?
>>>>>
>>>>> Thanks & Regards
>>>>>
>>>>> Somnath
>>>>>
>>>>> From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On
>>>>> Behalf Of Tuomas Juntunen
>>>>> Sent: Tuesday, June 30, 2015 8:33 AM
>>>>> To: 'ceph-users'
>>>>> Subject: Very low 4k randread performance ~1000iops
>>>>>
>>>>> Hi
>>>>>
>>>>> I have been trying to figure out why our 4k random reads in VMs
>>>>> are so bad. I am using fio to test this.
>>>>>
>>>>> Write : 170k iops
>>>>> Random write : 109k iops
>>>>> Read : 64k iops
>>>>> Random read : 1k iops
>>>>>
>>>>> Our setup is:
>>>>>
>>>>> 3 nodes with 36 OSDs and 18 SSDs, one SSD for two OSDs; each node
>>>>> has 64 GB mem & 2x 6-core CPUs
>>>>>
>>>>> 4 monitors running on other servers
>>>>>
>>>>> 40 Gbit InfiniBand with IPoIB
>>>>>
>>>>> OpenStack: qemu-kvm for virtuals
>>>>>
>>>>> Any help would be appreciated.
>>>>>
>>>>> Thank you in advance.
>>>>>
>>>>> Br,
>>>>>
>>>>> Tuomas
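For reference, the fio-rbd test Somnath suggests above uses fio's rbd ioengine and runs directly against an RBD image from a client node, which takes the guest OS and qemu out of the picture. A minimal sketch, assuming fio is built with rbd support; the pool, image, and client names are placeholders, not values from this thread:

    # "rbd" (pool), "fio-test" (image) and "admin" (cephx client) are hypothetical names.
    fio --name=rbd-4k-randread \
        --ioengine=rbd --clientname=admin --pool=rbd --rbdname=fio-test \
        --rw=randread --bs=4k --iodepth=64 --numjobs=1 \
        --runtime=60 --time_based

If this also tops out around 1k iops, the bottleneck is likely on the cluster side rather than in the VM stack; if it is much faster, the guest or qemu configuration deserves a closer look.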
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com