I was wondering the same. From a 'default' setup I get the performance below;
I have no idea whether this is bad, good or normal.
| Test | CephFS ssd rep. 3 lat | CephFS ssd rep. 3 iops | CephFS ssd rep. 3 throughput | CephFS ssd rep. 1 lat | CephFS ssd rep. 1 iops | CephFS ssd rep. 1 throughput | Samsung MZK7KM480 480GB lat | Samsung MZK7KM480 480GB iops | Samsung MZK7KM480 480GB throughput |
|---|---|---|---|---|---|---|---|---|---|
| 4k r ran. | 2.78 | 1781 | 7297 kB/s | 0.54 | 1809 | 7412 kB/s | 0.09 | 10.2k | 41600 kB/s |
| 4k w ran. | 1.42 | 700 | 2871 kB/s | 0.8 | 1238 | 5071 kB/s | 0.05 | 17.9k | 73200 kB/s |
| 4k r seq. | 0.29 | 3314 | 13.6 MB/s | 0.29 | 3325 | 13.6 MB/s | 0.05 | 18k | 77.6 MB/s |
| 4k w seq. | 0.04 | 889 | 3.64 MB/s | 0.56 | 1761 | 7.21 MB/s | 0.05 | 18.3k | 75.1 MB/s |
| 1024k r ran. | 4.3 | 231 | 243 MB/s | 4.27 | 233 | 245 MB/s | 2.06 | 482 | 506 MB/s |
| 1024k w ran. | 0.08 | 132 | 139 MB/s | 4.34 | 229 | 241 MB/s | 2.16 | 460 | 483 MB/s |
| 1024k r seq. | 4.23 | 235 | 247 MB/s | 4.21 | 236 | 248 MB/s | 1.98 | 502 | 527 MB/s |
| 1024k w seq. | 6.99 | 142 | 150 MB/s | 4.34 | 229 | 241 MB/s | 2.13 | 466 | 489 MB/s |
(4 nodes, CentOS7, luminous)
PS: I'm not sure why you test with one node. If you expand to a second node,
you might get an unpleasant surprise in the form of a drop in performance,
because you will be adding network latency that decreases your IOPS.
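To illustrate the latency point: with a single outstanding request (queue depth 1), IOPS is roughly the inverse of per-operation latency, so any added round trip lowers it directly. The sketch below is only a back-of-the-envelope model; the 0.5 ms base latency and 0.2 ms network round trip are made-up illustrative values, not measurements from either setup.

```python
def qd1_iops(latency_ms: float) -> float:
    """Approximate IOPS at queue depth 1: one op completes every latency_ms."""
    return 1000.0 / latency_ms


# Hypothetical numbers, chosen only for illustration.
base_latency_ms = 0.5    # assumed per-op latency with everything on one host
network_rtt_ms = 0.2     # assumed extra round trip once a second node is involved

local_iops = qd1_iops(base_latency_ms)
remote_iops = qd1_iops(base_latency_ms + network_rtt_ms)

print(f"single host: {local_iops:.0f} IOPS")
print(f"with network hop: {remote_iops:.0f} IOPS")
```

Higher queue depths can hide part of this, since several requests are in flight at once, but single-threaded synchronous I/O will feel the added latency in full.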
-----Original Message-----
From: Hector Martin [mailto:hector@xxxxxxxxxxxxxx]
Sent: 30 January 2019 19:43
To: ceph-users@xxxxxxxxxxxxxx
Subject: CephFS performance vs. underlying storage
Hi list,

I'm experimentally running single-host CephFS as a replacement for "traditional" filesystems.
My setup is 8×8TB HDDs using dm-crypt, with CephFS on a 5+2 EC pool. All of the components are running on the same host (mon/osd/mds/kernel CephFS client). I've set the stripe_unit/object_size to a relatively high 80MB (up from the default 4MB). I figure I want individual reads on the disks to be several megabytes per object for good sequential performance, and since this is an EC pool 4MB objects would be split into 800kB chunks, which is clearly not ideal. With 80MB objects, chunks are 16MB, which sounds more like a healthy read size for sequential access (e.g. something like 10 IOPS per disk during seq reads).
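The chunk-size arithmetic above can be sketched as follows. This is a rough model only: it assumes each object is split evenly across the k=5 data chunks of the 5+2 pool and uses decimal megabytes as the message does. The 160 MB/s per-disk sequential rate is a hypothetical figure picked so that the 16MB case works out to the "10 IOPS per disk" mentioned; it is not a measured value.

```python
def ec_chunk_mb(object_size_mb: float, k: int) -> float:
    """Size of each data chunk when an object is split across k data chunks."""
    return object_size_mb / k


def reads_per_second(disk_mb_s: float, chunk_mb: float) -> float:
    """Reads per second a disk must serve to sustain disk_mb_s with chunk_mb reads."""
    return disk_mb_s / chunk_mb


K_DATA_CHUNKS = 5          # the "5" in the 5+2 EC pool
ASSUMED_DISK_MB_S = 160.0  # hypothetical per-disk sequential rate, for illustration

for object_size_mb in (4, 80):
    chunk = ec_chunk_mb(object_size_mb, K_DATA_CHUNKS)
    rate = reads_per_second(ASSUMED_DISK_MB_S, chunk)
    print(f"{object_size_mb}MB object -> {chunk:g}MB chunk, "
          f"~{rate:.0f} reads/s per disk at {ASSUMED_DISK_MB_S:g} MB/s")
```

Under these assumptions the default 4MB objects would need about 20 times as many reads per disk as 80MB objects for the same throughput.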
With this config, I get about 270MB/s sequential from CephFS. On the same disks, an ext4 on dm-crypt on dm-raid6 yields ~680MB/s. So it seems Ceph achieves less than half of the raw performance that the underlying storage is capable of (with similar RAID redundancy).*
Obviously there will be some overhead with a stack as deep as Ceph compared to more traditional setups, but I'm wondering if there are improvements to be had here. While reading from CephFS I do not have significant CPU usage, so I don't think I'm CPU limited. Could the issue perhaps be latency through the stack / lack of read-ahead? Reading two files in parallel doesn't really get me more than 300MB/s in total, so parallelism doesn't seem to help much.
I'm curious as to whether there are any knobs I can play with to try to improve performance, or whether this level of overhead is pretty much inherent to Ceph. Even though this is an unusual single-host setup, I imagine proper clusters might also have similar results when comparing raw storage performance.
* Ceph has a slight disadvantage here because its chunk of the drives is logically after the traditional RAID, and HDDs get slower towards higher logical addresses, but this should be on the order of a 15-20% hit at most.
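As a quick check of the "less than half" figure, including the footnote's correction, the ratio can be worked out as below. This is a sketch using only the numbers quoted in this message; applying the 15-20% hit as a straight discount on the RAID6 figure is a simplifying assumption.

```python
CEPHFS_MB_S = 270.0   # sequential read from CephFS, as quoted
RAID6_MB_S = 680.0    # ext4 on dm-crypt on dm-raid6, as quoted


def efficiency(cephfs_mb_s: float, baseline_mb_s: float) -> float:
    """Fraction of the baseline throughput that CephFS achieves."""
    return cephfs_mb_s / baseline_mb_s


raw_ratio = efficiency(CEPHFS_MB_S, RAID6_MB_S)
print(f"unadjusted: {raw_ratio:.1%}")

# Discount the baseline by the footnote's estimated 15-20% slowdown
# for the higher logical addresses that Ceph's partitions occupy.
adjusted_ratios = {}
for hit in (0.15, 0.20):
    adjusted_ratios[hit] = efficiency(CEPHFS_MB_S, RAID6_MB_S * (1.0 - hit))
    print(f"baseline reduced by {hit:.0%}: {adjusted_ratios[hit]:.1%}")
```

Even with the full 20% discount the ratio stays just under one half, so the disk-position effect alone does not account for the gap.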
--
Hector Martin (hector@xxxxxxxxxxxxxx)
Public Key: https://mrcn.st/pub
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com