Hi,

For writes, Ceph writes twice to the disk: once for the journal and once for the data (so half the write bandwidth), and the journal is written with O_DSYNC (you should test your disk with fio --sync=1 to compare). That's why the recommendation is to use SSDs for the journal disks.

----- Original Message -----
From: "Pedro Miranda" <potter737@xxxxxxxxx>
To: "ceph-users" <ceph-users@xxxxxxxxxxxxxx>
Sent: Monday, 20 April 2015 11:25:38
Subject: RADOS Bench slow write speed

Hi all!!

I'm setting up a Ceph (version 0.80.6) cluster and I'm benchmarking the infrastructure and Ceph itself.

I've got 3 rack servers (Dell R630), each with its own disks in enclosures. The cluster network bandwidth is 10 Gbps, the bandwidth between the RAID controller (Dell H830) and the enclosure (MD1400) is 48 Gbps, and the negotiated speed with each disk (configured as JBOD) is 6 Gbps. We use 4 TB SAS disks, 7200 RPM, with a sustained bandwidth of 175 MB/s (confirmed with fio).

I tried to simulate a Ceph pattern by writing/reading a great number of 4 MB files on a single disk with XFS, with one instance of fio:

[writetest]
ioengine=libaio
directory=/var/local/xfs
filesize=4m
iodepth=256
rw=write
direct=1
numjobs=1
loops=1
nrfiles=18000
bs=4096k

For the other tests (sequential reads, random writes, random reads), only the "rw" value changes. I got the following bandwidth:

Sequential write speed: 137 MB/s
Sequential read speed: 144.5 MB/s
Random write speed: 134 MB/s
Random read speed: 144 MB/s

OK, so there is some overhead associated with writing a great number of 4 MB files.

Now I move on to benchmarking with an OSD on top of the disk, so I've got the cluster with only 1 OSD (separate partitions for journal and data) and run rados bench.

1 thread:
Writes: 27.7 MB/s
Reads: 90 MB/s
Random reads: 82 MB/s

16 threads (default):
Writes: 37 MB/s
Reads: 79.2 MB/s
Random reads: 71.4 MB/s

As you can see, writes are awfully slow. What I have noticed is very high latencies:

Total writes made:      16868
Write size:             4194304
Bandwidth (MB/sec):     37.449
Stddev Bandwidth:       24.606
Max bandwidth (MB/sec): 120
Min bandwidth (MB/sec): 0
Average Latency:        1.70862
Stddev Latency:         1.13098
Max latency:            6.33929
Min latency:            0.20767

Is this throughput to be expected from 7.2K RPM, 4 TB disks? Or is there anything in the Ceph configuration that might be changed to decrease the observed latency?

Any suggestions?

Appreciated,
Pedro Miranda.

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
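
For anyone who wants to try the fio --sync=1 comparison suggested in the reply at the top of this thread, a minimal job file along these lines should work; the job name, target directory, and size below are only placeholders (the directory is reused from Pedro's original job), so adjust them to the disk under test. It repeats the sequential 4 MB write, but opens the file with O_SYNC, which roughly mimics the O_DSYNC pattern of the filestore journal:

[syncwritetest]
; directory and size are placeholders; point them at the disk being tested
ioengine=libaio
directory=/var/local/xfs
size=4g
bs=4m
rw=write
direct=1
; sync=1 opens the file O_SYNC, approximating the journal's O_DSYNC writes
sync=1
iodepth=1
numjobs=1

Comparing the bandwidth of this job against the plain direct=1 run gives a rough idea of what the synchronous journal writes cost on these spinners, which is the gap that SSD journals are meant to close.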