Squeezing Performance of CEPH

Hi everybody,

I want to squeeze all the performance we can out of Ceph (we are running Jewel 10.2.7).
We have a test environment with two nodes, both with the same configuration:

  • CentOS 7.3
  • 24 CPUs (12 physical cores with hyper-threading)
  • 32 GB of RAM
  • 2x 100 Gbit/s Ethernet cards (bonded)
  • 2x SSDs in RAID, dedicated to the OS
  • 4x SATA 6 Gbit/s SSDs for the OSDs

We are already expecting the following bottlenecks:

  • [ SATA speed x number of disks ] = 6 Gbit/s x 4 = 24 Gbit/s
  • [ network speed x number of bonded cards ] = 100 Gbit/s x 2 = 200 Gbit/s

So the minimum of the two is 24 Gbit/s per node (not taking protocol overhead into account).

24 Gbit/s per node x 2 nodes = 48 Gbit/s of maximum hypothetical theoretical gross speed.
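
To make the arithmetic explicit, here is a minimal sketch of how those ceilings are derived (plain Python; the disk, NIC and node counts are the ones from the hardware list above):

# Theoretical ceilings derived from the hardware list above.
SATA_GBIT_PER_DISK = 6      # SATA 6 Gbit/s link per OSD disk
OSD_DISKS_PER_NODE = 4
NIC_GBIT = 100
NICS_PER_NODE = 2           # bonded
NODES = 2

disk_ceiling = SATA_GBIT_PER_DISK * OSD_DISKS_PER_NODE  # 24 Gbit/s per node
net_ceiling = NIC_GBIT * NICS_PER_NODE                  # 200 Gbit/s per node

per_node = min(disk_ceiling, net_ceiling)               # 24 Gbit/s
cluster = per_node * NODES                              # 48 Gbit/s gross
print(per_node, cluster)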

Here are the tests:
///////IPERF2///////
The results are quite good, scoring about 88% of the bottleneck (see the quick check after the output).
Note: iperf2 traffic goes over only one link of the bond (it's a well-known issue).

[ ID] Interval       Transfer     Bandwidth
[ 12]  0.0-10.0 sec  9.55 GBytes  8.21 Gbits/sec
[  3]  0.0-10.0 sec  10.3 GBytes  8.81 Gbits/sec
[  5]  0.0-10.0 sec  9.54 GBytes  8.19 Gbits/sec
[  7]  0.0-10.0 sec  9.52 GBytes  8.18 Gbits/sec
[  6]  0.0-10.0 sec  9.96 GBytes  8.56 Gbits/sec
[  8]  0.0-10.0 sec  12.1 GBytes  10.4 Gbits/sec
[  9]  0.0-10.0 sec  12.3 GBytes  10.6 Gbits/sec
[ 10]  0.0-10.0 sec  10.2 GBytes  8.80 Gbits/sec
[ 11]  0.0-10.0 sec  9.34 GBytes  8.02 Gbits/sec
[  4]  0.0-10.0 sec  10.3 GBytes  8.82 Gbits/sec
[SUM]  0.0-10.0 sec   103 GBytes  88.6 Gbits/sec
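
A quick sanity check on that ~88% figure, assuming the effective ceiling for this run is a single 100 Gbit/s link (since the iperf2 traffic is not spread across the bond):

measured_sum = 88.6   # Gbit/s, the [SUM] line above
single_link = 100.0   # Gbit/s, one slave of the bond
print(f"{measured_sum / single_link:.1%}")   # -> 88.6%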

///////RADOS BENCH///////

Taking into consideration the maximum hypothetical speed of 48 Gbit/s (due to the disk bottleneck), the results are not good enough:

  • Average write bandwidth is almost 5-7 Gbit/s (roughly 12.5% of the maximum hypothetical speed)
  • Average sequential-read bandwidth is almost 24 Gbit/s (50% of the maximum hypothetical speed)
  • Average random-read bandwidth is almost 27 Gbit/s (56.25% of the maximum hypothetical speed); the MB/s-to-Gbit/s conversion is sketched right after this list.
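
For reference, the conversion behind those percentages, as a rough sketch using the Bandwidth (MB/sec) values from the reports below and treating 1 MB as 10^6 bytes, which is close enough at this scale (the write figure moves around a lot between runs, hence the 5-7 Gbit/s range above):

def mbs_to_gbits(mb_per_sec):
    """Rough conversion: MB/s -> Gbit/s (1 MB ~= 10^6 bytes)."""
    return mb_per_sec * 8 / 1000

MAX_HYPOTHETICAL_GBITS = 48
for name, mbs in [("write", 601.406), ("seq read", 2994.61), ("rand read", 3337.71)]:
    gbits = mbs_to_gbits(mbs)
    print(f"{name}: {gbits:.1f} Gbit/s ({gbits / MAX_HYPOTHETICAL_GBITS:.0%} of the 48 Gbit/s ceiling)")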

Here are the reports.
Write:

# rados bench -p scbench 10 write --no-cleanup
Total time run:         10.229369
Total writes made:      1538
Write size:             4194304
Object size:            4194304
Bandwidth (MB/sec):     601.406
Stddev Bandwidth:       357.012
Max bandwidth (MB/sec): 1080
Min bandwidth (MB/sec): 204
Average IOPS:           150
Stddev IOPS:            89
Max IOPS:               270
Min IOPS:               51
Average Latency(s):     0.106218
Stddev Latency(s):      0.198735
Max latency(s):         1.87401
Min latency(s):         0.0225438

Sequential read:

# rados bench -p scbench 10 seq
Total time run:       2.054359
Total reads made:     1538
Read size:            4194304
Object size:          4194304
Bandwidth (MB/sec):   2994.61
Average IOPS:         748
Stddev IOPS:          67
Max IOPS:             802
Min IOPS:             707
Average Latency(s):   0.0202177
Max latency(s):       0.223319
Min latency(s):       0.00589238

Random read:

# rados bench -p scbench 10 rand
Total time run:       10.036816
Total reads made:     8375
Read size:            4194304
Object size:          4194304
Bandwidth (MB/sec):   3337.71
Average IOPS:         834
Stddev IOPS:          78
Max IOPS:             927
Min IOPS:             741
Average Latency(s):   0.0182707
Max latency(s):       0.257397
Min latency(s):       0.00469212

//------------------------------------

It seems like there is some bottleneck somewhere that we are underestimating.
Can you help me find it?




