On Mon, Sep 4, 2017 at 4:27 PM, <c.monty@xxxxxx> wrote:
Hello!
I'm validating IO performance of CephFS vs. NFS.
To compare them, I have mounted both filesystems on the same client.
Then I start fio with the following parameters:
rw = randrw
blocksize = 4k / 128k / 8m
rwmixread = 70 / 50 / 30
32 jobs run in parallel
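In job-file form, one cell of that matrix looks roughly like this (ioengine, direct, size and the target directory are just placeholders for illustration):

[global]
rw=randrw
runtime=60
time_based
numjobs=32
group_reporting
ioengine=psync        ; fio's default sync engine, one IO in flight per job
direct=1              ; bypass the page cache so the storage itself is measured
size=1G               ; per-job file size (placeholder)
directory=/mnt/cephfs ; or the NFS mount point for the comparison run

[randrw-4k-mix70]
blocksize=4k
rwmixread=70

The other runs only vary blocksize and rwmixread.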
The NFS share is striped over 5 virtual disks in a 4+1 RAID5 configuration; each disk has ~8TB.
The CephFS cluster has 2 MDS servers (1 up:active, 1 up:standby); each of these servers also hosts 47 OSDs, with one OSD per single 8TB disk.
(The disks used for the RAID5 and for the OSDs are identical.)
So that's a 2 node cluster? I'm assuming filestore OSDs with journals on the OSDs, 2x or 3x replication. The NFS server on local storage is going to perform much better as you've found.
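If you want to confirm that, something like this shows the layout and the replication factor (the data pool name is whatever `ceph fs ls` reports; cephfs_data is just the common default):

# how many hosts the 94 OSDs actually span
ceph osd tree
# pools backing CephFS
ceph fs ls
# replication factor of the data pool (name assumed)
ceph osd pool get cephfs_data size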
What I can see is that with blocksize 8m the IO performance of CephFS is slightly better, but with blocksize 128k / 4k it is worse by a factor of roughly 4 (128k) to 14 (4k).
Not surprising. You can only do so many small synchronous IOs per second when each one has to cross the network and, for writes, be replicated before it is acknowledged; local RAID doesn't pay that latency.
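A quick back-of-envelope from your own numbers: ~18 MiB/s combined at 4k is about 4700 IOPS, and with 32 jobs each keeping a single IO in flight (fio's default psync engine, assuming you didn't raise iodepth) that is 32 / 4700 ≈ 7 ms per operation, i.e. roughly a disk seek plus a network round trip plus replication. The NFS run at 4k (~257 MiB/s combined, ~66k IOPS) works out to ~0.5 ms per op, which five spindles cannot deliver, so most of those IOs are being absorbed by a cache (client page cache, NFS server memory or the RAID controller) rather than the disks.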
Here are the stats for randrw with rwmixread=30:
ld9930:/home # tail -n 3 ld9930-fio-test-cephfs-randrw30-8m
Run status group 0 (all jobs):
READ: bw=335MiB/s (351MB/s), 335MiB/s-335MiB/s (351MB/s-351MB/s), io=19.7GiB (21.2GB), run=60099-60099msec
WRITE: bw=753MiB/s (789MB/s), 753MiB/s-753MiB/s (789MB/s-789MB/s), io=44.2GiB (47.5GB), run=60099-60099msec
ld9930:/home # tail -n 3 ld9930-fio-test-nfs-randrw30-8m
Run status group 0 (all jobs):
READ: bw=324MiB/s (340MB/s), 324MiB/s-324MiB/s (340MB/s-340MB/s), io=19.0GiB (20.5GB), run=60052-60052msec
WRITE: bw=725MiB/s (760MB/s), 725MiB/s-725MiB/s (760MB/s-760MB/s), io=42.6GiB (45.7GB), run=60052-60052msec
ld9930:/home # tail -n 3 ld9930-fio-test-nfs-randrw30-128k
Run status group 0 (all jobs):
READ: bw=287MiB/s (301MB/s), 287MiB/s-287MiB/s (301MB/s-301MB/s), io=16.9GiB (18.7GB), run=60006-60006msec
WRITE: bw=667MiB/s (700MB/s), 667MiB/s-667MiB/s (700MB/s-700MB/s), io=39.1GiB (41.1GB), run=60006-60006msec
ld9930:/home # tail -n 3 ld9930-fio-test-cephfs-randrw30-128k
Run status group 0 (all jobs):
READ: bw=69.2MiB/s (72.6MB/s), 69.2MiB/s-69.2MiB/s (72.6MB/s-72.6MB/s), io=4172MiB (4375MB), run=60310-60310msec
WRITE: bw=161MiB/s (169MB/s), 161MiB/s-161MiB/s (169MB/s-169MB/s), io=9732MiB (10.3GB), run=60310-60310msec
ld9930:/home # tail -n 3 ld9930-fio-test-cephfs-randrw30-4k
Run status group 0 (all jobs):
READ: bw=5631KiB/s (5766kB/s), 5631KiB/s-5631KiB/s (5766kB/s-5766kB/s), io=330MiB (346MB), run=60043-60043msec
WRITE: bw=12.8MiB/s (13.4MB/s), 12.8MiB/s-12.8MiB/s (13.4MB/s-13.4MB/s), io=767MiB (804MB), run=60043-60043msec
ld9930:/home # tail -n 3 ld9930-fio-test-nfs-randrw30-4k
Run status group 0 (all jobs):
READ: bw=77.2MiB/s (80.8MB/s), 77.2MiB/s-77.2MiB/s (80.8MB/s-80.8MB/s), io=4621MiB (4846MB), run=60004-60004msec
WRITE: bw=180MiB/s (188MB/s), 180MiB/s-180MiB/s (188MB/s-188MB/s), io=10.6GiB (11.4GB), run=60004-60004msec
This implies that good IO performance on CephFS is only achievable with block sizes > 128k (I guess > 1M).
Can anybody confirm this?
THX
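One more thing worth checking before settling on a block-size cutoff: with one synchronous IO per job you are measuring latency, not what the cluster can sustain. Re-running the small-block cases with an async engine and a deeper queue, something like the sketch below (paths and sizes are placeholders), would show whether the 4k gap is latency or real throughput:

fio --name=cephfs-randrw-4k --directory=/mnt/cephfs --rw=randrw --rwmixread=30 \
    --bs=4k --ioengine=libaio --direct=1 --iodepth=16 --numjobs=32 \
    --time_based --runtime=60 --size=1G --group_reporting

If the CephFS 4k numbers scale up with iodepth while NFS stays put, the difference is per-op latency rather than a hard limit of the cluster.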
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com