Re: Squeezing Performance of CEPH

Hi Mark,

Having 2 nodes for testing allows me to downgrade the replication to 2x (until production).
The SSDs have the following product specs:

  • sequential read: 540MB/sec
  • sequential write: 520MB/sec

As you state, my sequential write per SSD should be:

~600 * 2 (copies) * 2 (journal writes per copy) / 8 (ssds) = ~300MB/s

If instead the 2 copies are written simultaneously to different cards/networks/nodes, my calculation is:

~600 * 2 (journal writes per copy) / 8 (ssds) = ~150MB/s

So yes, I think these numbers are terribly low (but maybe I am missing something): about 29% of the SSD's rated sequential write speed.
Sequential read is quite low too.
Maybe only random read is good.
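
To rule out the drives themselves, I suppose I could baseline one OSD SSD outside of Ceph, e.g. with an fio run along these lines (the target path is just a placeholder for a file on one of the SSDs; the block size and depth are illustrative):

    # sequential 4MB direct writes to a single SSD, bypassing Ceph
    fio --name=seqwrite --filename=/mnt/ssd-test/fio.bin --size=10g \
        --rw=write --bs=4m --direct=1 --ioengine=libaio --iodepth=16 \
        --runtime=30 --time_based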

Any suggestions?



On 22/06/2017 19:41, Mark Nelson wrote:
Hello Massimiliano,

Based on the configuration below, it appears you have 8 SSDs total (2 nodes with 4 SSDs each)?

I'm going to assume you have 3x replication and are using filestore, so in reality you are writing 3 copies and doing full data journaling for each copy, i.e. 6x writes per client write.  Taking this into account, your per-SSD throughput should be somewhere around:

Sequential write:
~600 * 3 (copies) * 2 (journal write per copy) / 8 (ssds) = ~450MB/s

Sequential read:
~3000 / 8 (ssds) = ~375MB/s

Random read:
~3337 / 8 (ssds) = ~417MB/s

These numbers are pretty reasonable for SATA-based SSDs, though the read throughput is a little low.  You didn't include the model of SSD, but if you look at Intel's DC S3700, which is a fairly popular SSD for Ceph:

https://www.intel.com/content/www/us/en/solid-state-drives/ssd-dc-s3700-spec.html

Sequential read is up to ~500MB/s and sequential write is up to ~460MB/s.  Not too far off from what you are seeing.  You might try playing with readahead on the OSD devices to see if that improves things at all.  Still, unless I've missed something, these numbers aren't terrible.
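
For example, something along these lines (the device name below is a placeholder; substitute your actual OSD data devices):

    # show current readahead, in 512-byte sectors
    blockdev --getra /dev/sdb
    # raise it, e.g. to 8192 sectors (4MB), then re-run the sequential read test
    blockdev --setra 8192 /dev/sdb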

Mark

On 06/22/2017 12:19 PM, Massimiliano Cuttini wrote:
Hi everybody,

I want to squeeze all the performance out of Ceph (we are using Jewel 10.2.7).
We are evaluating a test environment with 2 nodes, both with the same
configuration:

  * CentOS 7.3
  * 24 CPUs (12 physical cores with hyper-threading)
  * 32GB of RAM
  * 2x 100Gbit/s ethernet cards
  * 2x SSDs in RAID, dedicated to the OS
  * 4x SATA 6Gbit/s SSDs for OSDs

We are already expecting the following bottlenecks:

  * [ SATA speed (6Gbit/s) x 4 disks ] = 24Gbit/s
  * [ network speed (100Gbit/s) x 2 bonded cards ] = 200Gbit/s

So the minimum between them is 24Gbit/s per node (not taking into account
protocol overhead).

24Gbit/s per node x 2 = 48Gbit/s of maximum theoretical gross
speed.

Here are the tests:
///////IPERF2/////// Tests are quite good, scoring 88% of the bottleneck.
Note: a single iperf2 connection can only use one link of a bond (it's a well-known issue).
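
For reference, the results below come from a 10-stream run; the invocation was along these lines (the peer address is a placeholder):

    iperf -s                            # on node 2
    iperf -c <node2-address> -P 10      # on node 1, with 10 parallel streams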

    [ ID] Interval       Transfer     Bandwidth
    [ 12]  0.0-10.0 sec  9.55 GBytes  8.21 Gbits/sec
    [  3]  0.0-10.0 sec  10.3 GBytes  8.81 Gbits/sec
    [  5]  0.0-10.0 sec  9.54 GBytes  8.19 Gbits/sec
    [  7]  0.0-10.0 sec  9.52 GBytes  8.18 Gbits/sec
    [  6]  0.0-10.0 sec  9.96 GBytes  8.56 Gbits/sec
    [  8]  0.0-10.0 sec  12.1 GBytes  10.4 Gbits/sec
    [  9]  0.0-10.0 sec  12.3 GBytes  10.6 Gbits/sec
    [ 10]  0.0-10.0 sec  10.2 GBytes  8.80 Gbits/sec
    [ 11]  0.0-10.0 sec  9.34 GBytes  8.02 Gbits/sec
    [  4]  0.0-10.0 sec  10.3 GBytes  8.82 Gbits/sec
    [SUM]  0.0-10.0 sec   103 GBytes  88.6 Gbits/sec

///////RADOS BENCH

Taking into consideration the maximum theoretical speed of 48Gbit/s
(due to the disk bottleneck), the tests are not good enough:

  * Average write bandwidth is almost 5-7Gbit/s (12.5% of the maximum theoretical speed)
  * Average sequential read bandwidth is almost 24Gbit/s (50%)
  * Average random read bandwidth is almost 27Gbit/s (56.25%)
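
(The Gbit/s figures above come from the rados bench MB/s values reported below, converted with a factor of 8/1000, i.e. decimal units; for example:)

    # MB/s -> Gbit/s: multiply by 8, divide by 1000
    echo "601.406 * 8 / 1000" | bc -l    # ~4.81  Gbit/s (write)
    echo "2994.61 * 8 / 1000" | bc -l    # ~23.96 Gbit/s (seq read)
    echo "3337.71 * 8 / 1000" | bc -l    # ~26.70 Gbit/s (rand read)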

Here are the reports.
Write:

    # rados bench -p scbench 10 write --no-cleanup
    Total time run:         10.229369
    Total writes made:      1538
    Write size:             4194304
    Object size:            4194304
    Bandwidth (MB/sec):     601.406
    Stddev Bandwidth:       357.012
    Max bandwidth (MB/sec): 1080
    Min bandwidth (MB/sec): 204
    Average IOPS:           150
    Stddev IOPS:            89
    Max IOPS:               270
    Min IOPS:               51
    Average Latency(s):     0.106218
    Stddev Latency(s):      0.198735
    Max latency(s):         1.87401
    Min latency(s):         0.0225438

Sequential read:

    # rados bench -p scbench 10 seq
    Total time run:       2.054359
    Total reads made:     1538
    Read size:            4194304
    Object size:          4194304
    Bandwidth (MB/sec):   2994.61
    Average IOPS:         748
    Stddev IOPS:          67
    Max IOPS:             802
    Min IOPS:             707
    Average Latency(s):   0.0202177
    Max latency(s):       0.223319
    Min latency(s):       0.00589238

Random read:

    # rados bench -p scbench 10 rand
    Total time run:       10.036816
    Total reads made:     8375
    Read size:            4194304
    Object size:          4194304
    Bandwidth (MB/sec):   3337.71
    Average IOPS:         834
    Stddev IOPS:          78
    Max IOPS:             927
    Min IOPS:             741
    Average Latency(s):   0.0182707
    Max latency(s):       0.257397
    Min latency(s):       0.00469212

//------------------------------------

It seems like there is a bottleneck somewhere that we are
underestimating.
Can you help me find it?






_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
