[Single OSD performance on SSD] Can't go over 3, 2K IOPS

amberzhang86@xxxxxxxxx (Jian Zhang) · Mon, 1 Sep 2014 11:53:52 +0800

Somnath,
On the small workload, 107K IOPS is higher than the theoretical IOPS
of 520. Any idea why?

>> Single client is ~14K iops, but scaling as number of clients increases.
>> 10 clients *~107K* iops. ~25 cpu cores are used.
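
For context when reading the quoted numbers below: the small workload is described as 20K objects of 4K each on a server with 128 GB RAM, with the stated intent of serving the ios from memory. A minimal sketch of that arithmetic, assuming "4K" means 4 KiB and "GB" means GiB (the unit conventions and the helper name are my assumptions, not from the thread):

```python
# Figures taken from the quoted setup; unit conventions (KiB, GiB) are assumed.
NUM_OBJECTS = 20_000          # "20K objects"
OBJECT_SIZE = 4 * 1024        # "4K size", assumed to mean 4 KiB
RAM_BYTES = 128 * 1024**3     # "128 GB RAM", assumed to mean 128 GiB


def working_set_bytes(num_objects: int, object_size: int) -> int:
    """Total bytes touched if every object is read at least once."""
    return num_objects * object_size


ws = working_set_bytes(NUM_OBJECTS, OBJECT_SIZE)
ws_mib = ws / 1024**2          # working set in MiB
ram_fraction = ws / RAM_BYTES  # share of RAM the working set would occupy

print(f"working set: {ws_mib:.1f} MiB")
print(f"fraction of RAM: {ram_fraction:.6f}")
```

Under these assumptions the working set is roughly 78 MiB, well under 0.1% of RAM, so after warm-up the reads probably never reach the SSD. In that case the result would reflect the OSD code path rather than the drive.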

2014-09-01 11:52 GMT+08:00 Jian Zhang <amberzhang86 at gmail.com>:

> 2014-08-29 14:37 GMT+08:00 Somnath Roy <Somnath.Roy at sandisk.com>:
>
>   Thanks Haomai !
>>
>> Here is some of the data from my setup.
>>
>>
>>
>>
>> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>>
>> Set up:
>>
>> --------
>>
>>
>>
>> *32 core* cpu with HT enabled, 128 GB RAM, one SSD (both journal and
>> data) -> *one OSD*. 5 client m/c with 12 core cpu and each running two
>> instances of ceph_smalliobench (10 clients total). Network is 10GbE.
>>
>>
>>
>> Workload:
>>
>> -------------
>>
>>
>>
>> Small workload: 20K objects with 4K size and io_size is also *4K RR*.
>> The intent is to serve the ios from memory so that it can uncover the
>> performance problems within a single OSD.
>>
>>
>>
>> Results from Firefly:
>>
>> --------------------------
>>
>>
>>
>> Single client throughput is ~14K iops, but as the number of clients
>> increases the aggregated throughput does not increase. 10 clients: *~15K*
>> iops. ~9-10 cpu cores are used.
>>
>>
>>
>> Result with latest master:
>>
>> ------------------------------
>>
>>
>>
>> Single client is ~14K iops, but scaling as number of clients increases.
>> 10 clients *~107K* iops. ~25 cpu cores are used.
>>
>>
>>
>>
>> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>>
>>
>>
>>
>>
>> More realistic workload:
>>
>> -----------------------------
>>
>> Let's see how it performs when > 90% of the ios are served from
>> disks.
>>
>>  Setup:
>>
>> -------
>>
>> 40 cpu core server as a cluster node (single node cluster) with 64 GB
>> RAM. 8 SSDs -> *8 OSDs*. One similar node for monitor and rgw. Another
>> node for client running fio/vdbench. 4 rbds are configured with the 'noshare'
>> option. 40 GbE network.
>>
>>
>>
>> Workload:
>>
>> ------------
>>
>>
>>
>> 8 SSDs are populated, so 8 * 800GB = *~6.4 TB* of data. Io_size = *4K
>> RR*.
>>
>>
>>
>> Results from Firefly:
>>
>> ------------------------
>>
>>
>>
>> Aggregated output while 4 rbd clients stress the cluster in parallel
>> is *~20-25K IOPS*, cpu used is ~8-10 cores (may be less, can't
>> remember precisely).
>>
>>
>>
>> Results from latest master:
>>
>> --------------------------------
>>
>>
>>
>> Aggregated output while 4 rbd clients stress the cluster in parallel
>> is *~120K IOPS*, cpu is 7% idle, i.e. ~37-38 cpu cores are used.
>>
>>
>>
>> Hope this helps.
>>
>>
>>
>> Thanks & Regards
>>
>> Somnath
>>
>>
>>
>> -----Original Message-----
>> From: Haomai Wang [mailto:haomaiwang at gmail.com]
>> Sent: Thursday, August 28, 2014 8:01 PM
>> To: Somnath Roy
>> Cc: Andrey Korolyov; ceph-users at lists.ceph.com
>> Subject: Re: [ceph-users] [Single OSD performance on SSD] Can't go over
>> 3, 2K IOPS
>>
>>
>>
>> Hi Roy,
>>
>>
>>
>> I have already scanned your merged code for "fdcache" and "optimizing for
>> lfn_find/lfn_open". Could you share some performance improvement data for
>> it? I fully agree with your direction. Do you have any update on it?
>>
>>
>>
>> As for the messenger level, I have some very early work on it (
>> https://github.com/yuyuyu101/ceph/tree/msg-event). It contains a new
>> messenger implementation which supports different event mechanisms.
>>
>> It looks like it needs at least one more week to make it work.
>>
>>
>>
>> On Fri, Aug 29, 2014 at 5:48 AM, Somnath Roy <Somnath.Roy at sandisk.com>
>> wrote:
>>
>> > Yes, from what I saw the messenger level bottleneck is still huge!
>>
>> > Hopefully the RDMA messenger will resolve that and the performance gain
>> will be significant for Read (on SSDs). For write we need to uncover the
>> OSD bottlenecks first to take advantage of the improved upstream.
>>
>> > What I experienced is that until you remove the very last bottleneck the
>> performance improvement will not be visible, and that could be confusing
>> because you might think that the upstream improvement you did is not valid
>> (which is not the case).
>>
>> >
>>
>> > Thanks & Regards
>>
>> > Somnath
>>
>> > -----Original Message-----
>>
>> > From: Andrey Korolyov [mailto:andrey at xdel.ru]
>>
>> > Sent: Thursday, August 28, 2014 12:57 PM
>>
>> > To: Somnath Roy
>>
>> > Cc: David Moreau Simard; Mark Nelson; ceph-users at lists.ceph.com
>>
>> > Subject: Re: [ceph-users] [Single OSD performance on SSD] Can't go
>>
>> > over 3, 2K IOPS
>>
>> >
>>
>> > On Thu, Aug 28, 2014 at 10:48 PM, Somnath Roy <Somnath.Roy at sandisk.com>
>> wrote:
>>
>> >> Nope, this will not be back ported to Firefly I guess.
>>
>> >>
>>
>> >> Thanks & Regards
>>
>> >> Somnath
>>
>> >>
>>
>> >
>>
>> > Thanks for sharing this, the first thing that came to mind when I looked at
>>
>> > this thread was your patches :)
>>
>> >
>>
>> > If Giant incorporates them, both the RDMA support and those patches should
>> give a huge performance boost for RDMA-enabled Ceph backnets.
>>
>> >
>>
>> >
>>
>> > _______________________________________________
>>
>> > ceph-users mailing list
>>
>> > ceph-users at lists.ceph.com
>>
>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>>
>>
>>
>>
>>
>>
>> --
>>
>> Best Regards,
>>
>>
>>
>> Wheat
>>
>>
>>
>
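
To put the quoted results side by side, here is a rough sketch of throughput per CPU core and the Firefly-to-master speedup. The IOPS and core counts are copied from the quoted message. Reducing each range to its midpoint (e.g. ~9-10 cores to 9.5, ~20-25K to 22.5K) is my simplification, and the labels and helper name are only illustrative:

```python
# (label, IOPS, CPU cores used), copied from the quoted results.
# Ranges are reduced to midpoints, which is a simplification.
RESULTS = [
    ("small/firefly", 15_000, 9.5),    # ~15K iops, ~9-10 cores
    ("small/master", 107_000, 25.0),   # ~107K iops, ~25 cores
    ("disk/firefly", 22_500, 9.0),     # ~20-25K iops, ~8-10 cores
    ("disk/master", 120_000, 37.5),    # ~120K iops, ~37-38 cores
]


def iops_per_core(iops: float, cores: float) -> float:
    """Average IOPS delivered per CPU core in use."""
    return iops / cores


per_core = {label: iops_per_core(iops, cores) for label, iops, cores in RESULTS}
totals = {label: iops for label, iops, _ in RESULTS}

# Speedup of latest master over Firefly for each workload.
speedup_small = totals["small/master"] / totals["small/firefly"]
speedup_disk = totals["disk/master"] / totals["disk/firefly"]

for label, value in per_core.items():
    print(f"{label}: {value:.0f} IOPS per core")
print(f"small workload speedup: {speedup_small:.2f}x")
print(f"disk workload speedup: {speedup_disk:.2f}x")
```

With these midpoints, master delivers several times the aggregate IOPS of Firefly on both workloads, but it also uses far more cores to get there, so the per-core gain is smaller than the headline gain.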