Hi all, I ran into some problems in an RBD performance test and have a few questions.
Setup:
Linux kernel : 3.6.11
OS : Ubuntu 12.04
RAID card : LSI MegaRAID SAS 9260-4i
Every HDD : single-drive RAID0; Write Policy: Write Back with BBU, Read Policy: ReadAhead, IO Policy: Direct
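(For reference, this is roughly how the virtual-drive policies can be read back from the controller; the MegaCli binary name and path may differ per install:)

# MegaCli64 -LDInfo -Lall -aALL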
Storage server number : 1
Each storage server : 8 * HDD (one OSD per HDD, so 8 OSDs per server; 7200 rpm, 2 TB)
                      4 * SSD (every 2 OSDs share 1 SSD as journal; each SSD is split into two partitions, sdX1 and sdX2)
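(For reference, the journal mapping in ceph.conf looks roughly like this; the OSD IDs and device paths below are illustrative, not the exact config:)

[osd.0]
    osd journal = /dev/sdb1
[osd.1]
    osd journal = /dev/sdb2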
Ceph version : 0.56.4
Replicas : 2
Monitor number : 1

The write speed of one HDD:

# dd if=/dev/zero of=/dev/sdd bs=1024k count=10000 oflag=direct
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 69.3961 s, 151 MB/s

The write speed of one SSD:

# dd if=/dev/zero of=/dev/sdb bs=1024k count=10000 oflag=direct
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 40.8671 s, 257 MB/s

Then we used the RADOS benchmark and collectl to observe write performance:

# rados -p rbd bench 300 write -t 256
2013-04-05 14:31:13.732737 min lat: 4.28207 max lat: 5.92085 avg lat: 4.78598
  sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
  300     256     16043     15787   210.455       196      5.91   4.78598
Total time run:         300.588962
Total writes made:      16043
Write size:             4194304
Bandwidth (MB/sec):     213.488
Stddev Bandwidth:       40.6795
Max bandwidth (MB/sec): 288
Min bandwidth (MB/sec): 0
Average Latency:        4.75647
Stddev Latency:         0.37182
Max latency:            5.93183
Min latency:            0.590936

collectl on the HDD OSDs:

# collectl --iosize -sCDN --dskfilt "sd(c|d|e|f|g|h|i|j)"

# DISK STATISTICS (/sec)
#          <---------reads---------><---------writes---------><--------averages--------> Pct
#Name      KBytes Merged  IOs Size  KBytes Merged  IOs Size  RWSize  QLen  Wait SvcTim Util
sdc             0      0    0    0   76848    563  460  167     167    12    26      0   42
sdd             0      0    0    0   45100      0  165  273     273     6    36      1   30
sde             0      0    0    0   73800      0  270  273     273     3    14      1   41
sdf             0      0    0    0   73800      0  270  273     273    17    64      1   33
sdg             0      0    0    0   41000      0  150  273     273     1     7      0   10
sdh             0      0    0    0   57400      0  210  273     273     4    20      1   27
sdi             0      0    0    0   36904      0  136  271     271     0     5      0    7
sdj             0      0    0    0   77776      0  285  273     272    28    87      1   48

collectl on the journal SSDs:

# collectl --iosize -sCDN --dskfilt "sd(b|k|l|m)"

# DISK STATISTICS (/sec)
#          <---------reads---------><---------writes---------><--------averages--------> Pct
#Name      KBytes Merged  IOs Size  KBytes Merged  IOs Size  RWSize  QLen  Wait SvcTim Util
sdb             0      0    0    0  115552      0  388  298     297    75   159      2   77
sdk             0      0    0    0  114592      0  389  295     294    12    33      0   38
sdl             0      0    0    0  100364      0  334  300     300    35   148      2   69
sdm             0      0    0    0  101644      0  345  295     294   245   583      2   99  <= almost 99%
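(In case it is relevant to question 3 below, this is roughly how I would check how evenly the PGs are spread across the OSDs; I have not pasted the output here:)

# ceph osd tree    # OSD-to-host mapping and CRUSH weights
# ceph pg dump     # per-PG state, including which OSDs each PG maps to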
My questions are:

1. Is the rados bench write workload effectively a random write?
2. Why does write bandwidth hit a ceiling at ~213 MB/s, even when I increase the concurrency (-t 512)? It looks worse than expected: collectl shows each SSD writing only about 100~120 MB/s, but the SSDs should be capable of ~250 MB/s.
3. Why is one SSD (sdm) at almost 99% [Util]? Does that mean the data written to the OSDs is not evenly distributed?
4. If the SSDs are not the write bottleneck, what is likely to be?
5. How can I improve write performance?

Thanks!!

- Kelvin
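P.S. If it would help, I can re-test the raw SSD write speed at a request size closer to the ~270-300 KB average that collectl reports, instead of the 1 MB blocks used above, e.g.:

# dd if=/dev/zero of=/dev/sdb bs=256k count=40000 oflag=direct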