Hello!

I'm validating the IO performance of CephFS vs. NFS. For this I have mounted both filesystems on the same client and run fio with the following parameters:

  rw        = randwrite, randrw
  blocksize = 4k, 128k, 8m
  rwmixread = 70, 50, 30
  32 jobs run in parallel

The NFS share stripes over 5 virtual disks in a 4+1 RAID5 configuration; each disk has ~8TB. The CephFS is configured on 2 MDS servers (1 up:active, 1 up:standby); each MDS has 47 OSDs, where each OSD is backed by a single 8TB disk. (The disks behind the RAID5 and the OSDs are identical.)

What I see is that with blocksize 8m the IO performance of CephFS is slightly better, but with blocksize 4k / 128k it is worse by a factor of 4-10.

Here are the stats for randrw with read mix 30%:

ld9930:/home # tail -n 3 ld9930-fio-test-cephfs-randrw30-8m
Run status group 0 (all jobs):
   READ: bw=335MiB/s (351MB/s), 335MiB/s-335MiB/s (351MB/s-351MB/s), io=19.7GiB (21.2GB), run=60099-60099msec
  WRITE: bw=753MiB/s (789MB/s), 753MiB/s-753MiB/s (789MB/s-789MB/s), io=44.2GiB (47.5GB), run=60099-60099msec

ld9930:/home # tail -n 3 ld9930-fio-test-nfs-randrw30-8m
Run status group 0 (all jobs):
   READ: bw=324MiB/s (340MB/s), 324MiB/s-324MiB/s (340MB/s-340MB/s), io=19.0GiB (20.5GB), run=60052-60052msec
  WRITE: bw=725MiB/s (760MB/s), 725MiB/s-725MiB/s (760MB/s-760MB/s), io=42.6GiB (45.7GB), run=60052-60052msec

ld9930:/home # tail -n 3 ld9930-fio-test-nfs-randrw30-128k
Run status group 0 (all jobs):
   READ: bw=287MiB/s (301MB/s), 287MiB/s-287MiB/s (301MB/s-301MB/s), io=16.9GiB (18.7GB), run=60006-60006msec
  WRITE: bw=667MiB/s (700MB/s), 667MiB/s-667MiB/s (700MB/s-700MB/s), io=39.1GiB (41.1GB), run=60006-60006msec

ld9930:/home # tail -n 3 ld9930-fio-test-cephfs-randrw30-128k
Run status group 0 (all jobs):
   READ: bw=69.2MiB/s (72.6MB/s), 69.2MiB/s-69.2MiB/s (72.6MB/s-72.6MB/s), io=4172MiB (4375MB), run=60310-60310msec
  WRITE: bw=161MiB/s (169MB/s), 161MiB/s-161MiB/s (169MB/s-169MB/s), io=9732MiB (10.3GB), run=60310-60310msec

ld9930:/home # tail -n 3 \
    ld9930-fio-test-cephfs-randrw30-4k
Run status group 0 (all jobs):
   READ: bw=5631KiB/s (5766kB/s), 5631KiB/s-5631KiB/s (5766kB/s-5766kB/s), io=330MiB (346MB), run=60043-60043msec
  WRITE: bw=12.8MiB/s (13.4MB/s), 12.8MiB/s-12.8MiB/s (13.4MB/s-13.4MB/s), io=767MiB (804MB), run=60043-60043msec

ld9930:/home # tail -n 3 ld9930-fio-test-nfs-randrw30-4k
Run status group 0 (all jobs):
   READ: bw=77.2MiB/s (80.8MB/s), 77.2MiB/s-77.2MiB/s (80.8MB/s-80.8MB/s), io=4621MiB (4846MB), run=60004-60004msec
  WRITE: bw=180MiB/s (188MB/s), 180MiB/s-180MiB/s (188MB/s-188MB/s), io=10.6GiB (11.4GB), run=60004-60004msec

This suggests that for good IO performance on CephFS, only blocksizes larger than 128k (I guess > 1M) should be used. Can anybody confirm this?

THX

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
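P.S.: For reproducibility, here is a sketch of a fio job file for one of the combinations above (randrw, 30% read, 128k). The directory, per-job file size, and the 60s time_based runtime are assumptions on my part (the run=60xxxmsec values suggest a fixed 60s runtime), not confirmed parameters from the original runs:

```
; cephfs-randrw30-128k.fio -- one test case, 32 parallel jobs
; directory and size are assumed; point directory at the CephFS
; or NFS mount to compare the two
[global]
directory=/mnt/cephfs/fiotest
rw=randrw
rwmixread=30
bs=128k
size=2g
runtime=60
time_based
numjobs=32
group_reporting

[randrw30-128k]
```

Run it with e.g. `fio --output=ld9930-fio-test-cephfs-randrw30-128k cephfs-randrw30-128k.fio`; group_reporting collapses the 32 jobs into the single "Run status group 0 (all jobs)" summary shown above.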