Hi Ashley,
I know; I was already expecting the bottleneck to be
the minimum between bandwidth and disks (and it was
currently the disks in my first email).
I think the write throughput is still too low.
I read that removing the journal overhead is not a good idea.
However, I'm writing twice to each SSD... that doesn't seem
like a good idea either.
How is it possible to remove this overhead?
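To make the double write concrete, here is a rough sketch of filestore write amplification (my own arithmetic, not output from any Ceph tool; the function name is made up for illustration):

```python
# Sketch of filestore write amplification, assuming 3x replication and
# journals co-located on the OSD SSDs (each copy is written twice:
# once to the journal, once to the data partition).
def raw_writes_per_client_mb(client_mb, replicas=3, journal_factor=2):
    """Total MB hitting the SSDs for each MB a client writes."""
    return client_mb * replicas * journal_factor

# For every 1 MB a client writes, the cluster's SSDs absorb 6 MB:
print(raw_writes_per_client_mb(1))  # -> 6
```

Moving the journals to separate devices removes the double write from the OSD SSDs themselves, though the total number of device writes stays the same.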
On 22/06/2017 19:47, Ashley Merrick wrote:
Hello,
Also, as Mark put it, one minute you're testing bandwidth
capacity, the next minute you're testing disk capacity.
There's no way a small set of SSDs is going to be able to max
out your current bandwidth, even if you removed the Ceph/journal
overhead. I would say the speeds you are getting are what you
should expect, and are in line with many other setups.
Ashley
Sent from my iPhone
Hello Massimiliano,
Based on the configuration below, it appears you have
8 SSDs total (2 nodes with 4 SSDs each)?
I'm going to assume you have 3x replication and that
you are using filestore, so in reality you are writing 3
copies and doing full data journaling for each copy, for
6x writes per client write. Taking this into account,
your per-SSD throughput should be somewhere around:
Sequential write:
~600 * 3 (copies) * 2 (journal write per copy) / 8 (ssds) = ~450MB/s
Sequential read:
~3000 / 8 (ssds) = ~375MB/s
Random read:
~3337 / 8 (ssds) = ~417MB/s
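For reference, the per-SSD arithmetic above can be reproduced in a few lines (a sketch; the variable names are mine, values taken from the rados bench runs below):

```python
# Per-SSD throughput estimates under filestore: writes are multiplied by
# 3 (replication) and 2 (journal write per copy); reads hit only one copy.
ssds = 8  # 2 nodes x 4 OSD SSDs

seq_write_per_ssd = 600 * 3 * 2 / ssds   # ~450 MB/s per SSD
seq_read_per_ssd  = 3000 / ssds          # ~375 MB/s per SSD
rand_read_per_ssd = 3337 / ssds          # ~417 MB/s per SSD

print(seq_write_per_ssd, seq_read_per_ssd, rand_read_per_ssd)
```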
These numbers are pretty reasonable for SATA-based
SSDs, though the read throughput is a little low. You
didn't include the model of SSD, but if you look at
Intel's DC S3700, which is a fairly popular SSD for Ceph:
https://www.intel.com/content/www/us/en/solid-state-drives/ssd-dc-s3700-spec.html
Sequential read is up to ~500MB/s and Sequential write
speeds up to 460MB/s. Not too far off from what you are
seeing. You might try playing with readahead on the OSD
devices to see if that improves things at all. Still,
unless I've missed something these numbers aren't
terrible.
Mark
On 06/22/2017 12:19 PM, Massimiliano Cuttini wrote:
Hi everybody,
I want to squeeze all the
performance out of Ceph (we are using Jewel 10.2.7).
We are benchmarking a test
environment with 2 nodes, both with the same
configuration:
* CentOS 7.3
* 24 CPUs (12 physical cores with hyper-threading)
* 32GB of RAM
* 2x 100Gbit/s Ethernet cards
* 2x dedicated OS SSDs in RAID
* 4x SATA 6Gbit/s SSDs as OSDs
We are already expecting the
following bottlenecks:
* [ SATA speed x n° disks ] = 24Gbit/s
* [ network speed x n° bonded cards ] = 200Gbit/s
So the minimum between them is 24Gbit/s per node (not
taking protocol overhead into account).
24Gbit/s per node x 2 = 48Gbit/s of maximum
hypothetical gross speed (MHS).
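As a sanity check, the bottleneck arithmetic above can be written out as (a sketch; the variable names are my own):

```python
# Expected bottleneck per node: the slower of aggregate SATA bandwidth
# and aggregate network bandwidth.
sata_gbit = 6 * 4        # 4 OSD SSDs on SATA 6Gbit/s = 24 Gbit/s
network_gbit = 100 * 2   # 2 bonded 100Gbit/s NICs    = 200 Gbit/s

per_node = min(sata_gbit, network_gbit)  # disks are the limit: 24 Gbit/s
cluster_mhs = per_node * 2               # 2 nodes -> 48 Gbit/s gross
print(per_node, cluster_mhs)
```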
Here are the tests:
///////IPERF2/////// Tests are
quite good, scoring 88% of the bottleneck.
Note: iperf2 can use only 1
connection of a bond (it's a well-known issue).
[ ID] Interval       Transfer     Bandwidth
[ 12]  0.0-10.0 sec   9.55 GBytes   8.21 Gbits/sec
[  3]  0.0-10.0 sec  10.3  GBytes   8.81 Gbits/sec
[  5]  0.0-10.0 sec   9.54 GBytes   8.19 Gbits/sec
[  7]  0.0-10.0 sec   9.52 GBytes   8.18 Gbits/sec
[  6]  0.0-10.0 sec   9.96 GBytes   8.56 Gbits/sec
[  8]  0.0-10.0 sec  12.1  GBytes  10.4  Gbits/sec
[  9]  0.0-10.0 sec  12.3  GBytes  10.6  Gbits/sec
[ 10]  0.0-10.0 sec  10.2  GBytes   8.80 Gbits/sec
[ 11]  0.0-10.0 sec   9.34 GBytes   8.02 Gbits/sec
[  4]  0.0-10.0 sec  10.3  GBytes   8.82 Gbits/sec
[SUM]  0.0-10.0 sec  103   GBytes  88.6  Gbits/sec
///////RADOS BENCH///////
Taking into consideration the maximum hypothetical
speed of 48Gbit/s (due to the disk bottleneck),
the results are not good enough:
* Average write bandwidth is
almost 5-7Gbit/s (12.5% of the MHS)
* Average sequential read bandwidth is
almost 24Gbit/s (50% of the MHS)
* Average random read bandwidth is
almost 27Gbit/s (56.25% of the MHS)
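For reference, the conversion from the measured averages to a share of the MHS can be sketched like this (my own helper, assuming decimal MB; rados bench actually reports MiB/s, which comes out about 5% higher):

```python
# Convert rados bench averages (MB/s) to Gbit/s and to a share of the
# 48 Gbit/s maximum hypothetical speed (MHS). Assumes decimal MB.
MHS_GBIT = 48

def share_of_mhs(mb_per_s):
    gbit = mb_per_s * 8 / 1000  # MB/s -> Gbit/s
    return gbit, 100 * gbit / MHS_GBIT

for name, mb in [("write", 601.406), ("seq read", 2994.61), ("rand read", 3337.71)]:
    gbit, pct = share_of_mhs(mb)
    print(f"{name}: {gbit:.1f} Gbit/s ({pct:.0f}% of MHS)")
```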
Here are the reports.
Write:
# rados bench -p scbench 10 write --no-cleanup
Total time run:         10.229369
Total writes made:      1538
Write size:             4194304
Object size:            4194304
Bandwidth (MB/sec):     601.406
Stddev Bandwidth:       357.012
Max bandwidth (MB/sec): 1080
Min bandwidth (MB/sec): 204
Average IOPS:           150
Stddev IOPS:            89
Max IOPS:               270
Min IOPS:               51
Average Latency(s):     0.106218
Stddev Latency(s):      0.198735
Max latency(s):         1.87401
Min latency(s):         0.0225438
Sequential read:
# rados bench -p scbench 10 seq
Total time run:         2.054359
Total reads made:       1538
Read size:              4194304
Object size:            4194304
Bandwidth (MB/sec):     2994.61
Average IOPS:           748
Stddev IOPS:            67
Max IOPS:               802
Min IOPS:               707
Average Latency(s):     0.0202177
Max latency(s):         0.223319
Min latency(s):         0.00589238
Random read:
# rados bench -p scbench 10 rand
Total time run:         10.036816
Total reads made:       8375
Read size:              4194304
Object size:            4194304
Bandwidth (MB/sec):     3337.71
Average IOPS:           834
Stddev IOPS:            78
Max IOPS:               927
Min IOPS:               741
Average Latency(s):     0.0182707
Max latency(s):         0.257397
Min latency(s):         0.00469212
//------------------------------------
It seems like there is a
bottleneck somewhere that we are
underestimating.
Can you help me find it?
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com