Hi,

For writes, Ceph writes twice to the disk: once for the journal and once for the data (so half the write bandwidth), and the journal is written with O_DSYNC (you should test your disk with fio --sync=1 to compare). That's why the recommendation is to use SSDs for the journal disks.

----- Original Message -----
From: "Pedro Miranda" <potter737@xxxxxxxxx>
To: "ceph-users" <ceph-users@xxxxxxxxxxxxxx>
Sent: Monday, 20 April 2015 11:25:38
Subject: RADOS Bench slow write speed

Hi all!!

I'm setting up a Ceph (version 0.80.6) cluster and I'm benchmarking the infrastructure and Ceph itself.

I've got 3 rack servers (Dell R630), each with its own disks in enclosures. The cluster network bandwidth is 10 Gbps, the bandwidth between the RAID controller (Dell H830) and the enclosure (MD1400) is 48 Gbps, and the negotiated speed with each disk (configured as JBOD) is 6 Gbps. We use 4 TB SAS disks, 7200 RPM, with a sustained bandwidth of 175 MB/s (confirmed with fio).

I tried to simulate a Ceph pattern by writing/reading a great number of 4 MB files on a single disk with XFS, with one instance of fio:

[writetest]
ioengine=libaio
directory=/var/local/xfs
filesize=4m
iodepth=256
rw=write
direct=1
numjobs=1
loops=1
nrfiles=18000
bs=4096k

For the other tests (sequential reads, random writes, random reads), only the "rw" value changes. I got the following bandwidth:

Sequential write speed: 137 MB/s
Sequential read speed: 144.5 MB/s
Random write speed: 134 MB/s
Random read speed: 144 MB/s

OK, so there is some overhead associated with writing a great number of 4 MB files.

Now I move on to benchmarking with an OSD on top of the disk, so I've got the cluster with only 1 OSD (separate partitions for journal and data) and run rados bench.

1 thread:
Writes: 27.7 MB/s
Reads: 90 MB/s
Random reads: 82 MB/s

16 threads (default):
Writes: 37 MB/s
Reads: 79.2 MB/s
Random reads: 71.4 MB/s

As you can see, writes are awfully slow. What I have noticed is very high latencies:

Total writes made:      16868
Write size:             4194304
Bandwidth (MB/sec):     37.449
Stddev Bandwidth:       24.606
Max bandwidth (MB/sec): 120
Min bandwidth (MB/sec): 0
Average Latency:        1.70862
Stddev Latency:         1.13098
Max latency:            6.33929
Min latency:            0.20767

Is this throughput to be expected from 7.2K RPM, 4 TB disks? Or is there anything in the Ceph configuration that might be changed to decrease the observed latency?

Any suggestions?

Appreciated,
Pedro Miranda.

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
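
For anyone who wants to try the fio --sync=1 comparison suggested in the reply at the top of this thread, a minimal job file along these lines should work; the job name, target directory, and size below are only placeholders (the directory is reused from Pedro's original job), so adjust them to the disk under test. It repeats the sequential 4 MB write, but opens the file with O_SYNC, which roughly mimics the O_DSYNC pattern of the filestore journal:

[syncwritetest]
; directory and size are placeholders; point them at the disk being tested
ioengine=libaio
directory=/var/local/xfs
size=4g
bs=4m
rw=write
direct=1
; sync=1 opens the file O_SYNC, approximating the journal's O_DSYNC writes
sync=1
iodepth=1
numjobs=1

Comparing the bandwidth of this job against the plain direct=1 run gives a rough idea of what the synchronous journal writes cost on these spinners, which is the gap that SSD journals are meant to close.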