Mike,
So I do have to ask: where would the extra latency be coming from if all my OSDs are on the same machine that my test VM is running on? I have tried every SSD tweak in the book. The main issue that concerns me is read performance of sequential IOs in the 4-8K range. I would expect those to pull from three SSD disks on a local machine at least as fast as one native SSD test, but I don't see that; it's actually slower.
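For reference, the comparison I have in mind is something like the following (just a sketch; /dev/sdX and /dev/vdX are placeholders for the native SSD and the RBD-backed disk inside the VM, and iflag=direct assumes GNU dd so the page cache stays out of the picture):

dd if=/dev/sdX of=/dev/null bs=8k count=131072 iflag=direct   # native SSD, ~1 GB of 8K sequential reads
dd if=/dev/vdX of=/dev/null bs=8k count=131072 iflag=direct   # same reads against the RBD-backed disk in the VM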
On Wed, Sep 18, 2013 at 4:02 PM, Jason Villalta <jason@xxxxxxxxxxxx> wrote:
Thanks Mike, high hopes, right ;)
I guess we are not doing too badly compared to your numbers then. I just wish the gap were a little closer between native and Ceph per OSD.
C:\Program Files (x86)\SQLIO>sqlio -kW -t8 -s30 -o8 -fsequential -b1024 -BH -LS c:\TestFile.dat
sqlio v1.5.SG
using system counter for latency timings, 100000000 counts per second
8 threads writing for 30 secs to file c:\TestFile.dat
using 1024KB sequential IOs
enabling multiple I/Os per thread with 8 outstanding
buffering set to use hardware disk cache (but not file cache)
using current size: 10240 MB for file: c:\TestFile.dat
initialization done

CUMULATIVE DATA:
throughput metrics:
IOs/sec:   180.20
MBs/sec:   180.20
latency metrics:
Min_Latency(ms): 39
Avg_Latency(ms): 352
Max_Latency(ms): 692
histogram:
ms: 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24+
%:  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0 100

On Wed, Sep 18, 2013 at 3:55 PM, Mike Lowe <j.michael.lowe@xxxxxxxxx> wrote:
Well, in a word, yes. You really expect a network-replicated storage system in user space to be comparable to direct-attached SSD storage? For what it's worth, I've got a pile of regular spinning rust; this is what my cluster will do inside a VM with RBD writeback caching on. As you can see, latency is everything.

dd if=/dev/zero of=1g bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 6.26289 s, 171 MB/s

dd if=/dev/zero of=1g bs=1M count=1024 oflag=dsync
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 37.4144 s, 28.7 MB/s

As you can see, latency is a killer.

On Sep 18, 2013, at 3:23 PM, Jason Villalta <jason@xxxxxxxxxxxx> wrote:

Any other thoughts on this thread, guys? Am I just crazy to want near-native SSD performance on a small SSD cluster?

On Wed, Sep 18, 2013 at 8:21 AM, Jason Villalta <jason@xxxxxxxxxxxx> wrote:
That dd gives me this:

dd if=ddbenchfile of=- bs=8K | dd if=- of=/dev/null bs=8K
8192000000 bytes (8.2 GB) copied, 31.1807 s, 263 MB/s
Which makes sense, because the SSD is running as SATA 2, which should give 3 Gbps or ~300 MBps.

I am still trying to better understand the speed difference between the small block speeds seen with dd versus the same small object size with rados (a rados bench sketch for this comparison is at the very bottom of this email). It is not a difference of a few MB per sec; it seems to be nearly a factor of 10. I just want to know if this is a hard limit in Ceph or a factor of the underlying disk speed. Meaning, if I use spindles to read data, would the speed be the same, or would the read speed be a factor of 10 less than the speed of the underlying disk?

On Wed, Sep 18, 2013 at 4:27 AM, Alex Bligh <alex@xxxxxxxxxxx> wrote:
As a general point, this benchmark may not do what you think it does, depending on the version of dd, as writes to /dev/null can be heavily optimised.
On 17 Sep 2013, at 21:47, Jason Villalta wrote:
> dd if=ddbenchfile of=/dev/null bs=8K
> 8192000000 bytes (8.2 GB) copied, 19.7318 s, 415 MB/s
Try:
dd if=ddbenchfile of=- bs=8K | dd if=- of=/dev/null bs=8K
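Or, assuming GNU dd on Linux, take the page cache out of the read side entirely with O_DIRECT (dropping caches first, as root, so nothing is served from memory):

echo 3 > /proc/sys/vm/drop_caches
dd if=ddbenchfile of=/dev/null bs=8K iflag=direct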
--
Alex Bligh
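For completeness, the client-side settings I assume Mike means by "RBD writeback caching" are roughly the following (a guess at a typical librbd setup, not taken from his cluster; the exact behaviour also depends on the QEMU cache= setting):

[client]
rbd cache = true
rbd cache writethrough until flush = true
rbd cache size = 33554432   # 32 MB, the usual default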
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
--
Jason Villalta
Co-founder
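PS - the dd vs. rados small-block comparison mentioned above would be along these lines (a sketch only; the pool name "rbd" and the thread count are assumptions rather than anything from my setup, and the cleanup flags may vary by rados version):

rados bench -p rbd 30 write -b 8192 -t 16 --no-cleanup   # 30 seconds of 8K object writes
rados bench -p rbd 30 seq -t 16                          # sequential reads of the objects just written
rados -p rbd cleanup                                     # remove the benchmark objects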