Hi,

we are currently building a new Ceph cluster with NVMe devices only. Each node consists of 4x Intel P3600 2TB devices, with journal and filestore on the same device. Each server has a 10-core CPU and uses 10 GBit Ethernet NICs for public and Ceph storage traffic. We are currently testing with 4 nodes overall. The cluster will be used only for virtual machine images via RBD; the pools are replicated (no EC).

Although we are pretty happy with the single-threaded write performance, the single-threaded (iodepth=1) sequential read performance is a bit disappointing. We are testing with fio and the rbd engine. After creating a 10GB RBD image, we use the following fio parameters:

"""
[global]
invalidate=1
ioengine=rbd
iodepth=1
ramp_time=2
size=2G
bs=4k
direct=1
buffered=0
"""

For a 4k workload we reach 1382 IOPS. Testing one NVMe device directly (with the psync engine and an iodepth of 1) we can reach up to 84176 IOPS, which is a big difference.

I have already read that the read_ahead setting might improve the situation, although that would only apply when using buffered reads, right? Does anyone have other suggestions for getting better sequential read performance?
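For completeness, a runnable version of the job file looks roughly like this; the cephx user, pool, and image name are placeholders for our setup:

"""
[global]
invalidate=1
ioengine=rbd
# cephx user, pool, and image name are placeholders
clientname=admin
pool=rbd
rbdname=fio_test
iodepth=1
ramp_time=2
size=2G
bs=4k
direct=1
buffered=0

[seq-read]
rw=read
"""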
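The raw-device baseline was taken roughly like this (read-only, so non-destructive; the device name is a placeholder):

"""
fio --name=nvme-baseline --filename=/dev/nvme0n1 \
    --ioengine=psync --rw=read --bs=4k --iodepth=1 \
    --direct=1 --size=2G
"""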
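In case it matters, this is how we would set read-ahead per device on the OSD nodes (the device name is a placeholder; note the sysfs knob takes KB while blockdev takes 512-byte sectors):

"""
# 4 MB read-ahead on one OSD device
echo 4096 > /sys/block/nvme0n1/queue/read_ahead_kb
# equivalent via blockdev (8192 sectors * 512 B = 4 MB)
blockdev --setra 8192 /dev/nvme0n1
"""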
Cheers
Nick

--
Sebastian Nickel
Nine Internet Solutions AG, Albisriederstr. 243a, CH-8047 Zuerich
Tel +41 44 637 40 00 | Support +41 44 637 40 40 | www.nine.ch