Hello Mark,

See below my benchmark results:

- RADOS bench, 4M block size, write:

# rados -p bench bench 300 write -t 32 --no-cleanup
 Maintaining 32 concurrent writes of 4194304 bytes for at least 300 seconds.
2012-11-19 21:35:01.722143 min lat: 0.255396 max lat: 8.40212 avg lat: 1.14076
   sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
   300      32      8414      8382   111.737       104  0.502774   1.14076
 Total time run:         300.814954
Total writes made:      8414
Write size:             4194304
Bandwidth (MB/sec):     111.883
Stddev Bandwidth:       7.4274
Max bandwidth (MB/sec): 132
Min bandwidth (MB/sec): 56
Average Latency:        1.14352
Stddev Latency:         1.18344
Max latency:            8.40212
Min latency:            0.255396

- RADOS bench, 4M block size, seq:

# rados -p bench bench 300 seq -t 32 --no-cleanup
2012-11-19 21:40:35.128728 min lat: 0.224415 max lat: 6.14781 avg lat: 1.1591
   sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
   300      31      8284      8253   110.021       108   1.87698    1.1591
 Total time run:        300.931287
Total reads made:      8285
Read size:             4194304
Bandwidth (MB/sec):    110.125
Average Latency:       1.16177
Max latency:           6.14781
Min latency:           0.224415

- RBD fio test. As you recommended, I used a 4M block size for the sequential tests in the first run.
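One way to sanity-check these aggregates (a sketch using only numbers quoted in this post; the variable names are mine) is Little's law: in-flight ops ≈ IOPS × average latency, which should land near the configured concurrency (-t 32 for rados bench, iodepth for the fio runs further down):

```python
# Cross-check the benchmark aggregates quoted in this post.
# All inputs below are copied verbatim from the rados bench / fio output.

# --- rados bench, 4M writes, -t 32 ---
total_writes = 8414        # "Total writes made"
total_time   = 300.814954  # "Total time run" (seconds)
avg_lat      = 1.14352     # "Average Latency" (seconds)
block_mb     = 4           # write size: 4194304 bytes = 4 MB

bandwidth = total_writes * block_mb / total_time   # MB/s
iops      = total_writes / total_time              # 4M ops per second

# Little's law: in-flight ops ~= IOPS * avg latency,
# which should come out near the 32 concurrent writes requested with -t 32.
inflight = iops * avg_lat

print(f"rados bench: {bandwidth:.3f} MB/s, {iops:.1f} IOPS, ~{inflight:.1f} in flight")

# --- fio rand-read (4K), iodepth 4 vs 64 (IOPS and avg lat from the runs below) ---
for iodepth, fio_iops, lat_usec in [(4, 674, 5926.07), (64, 26310, 2430.20)]:
    qd = fio_iops * lat_usec / 1e6
    print(f"fio QD{iodepth}: {fio_iops} IOPS * {lat_usec} us ~= {qd:.1f} in flight")
```

The recomputed bandwidth matches the reported 111.883 MB/s, and in each case IOPS × latency recovers the configured queue depth, so the numbers are internally consistent.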
See below the fio configuration file used:

[global]
ioengine=libaio
iodepth=4
size=1G
runtime=60
filename=/dev/rbd1

[seq-read]
rw=read
bs=4M
stonewall
direct=1

[rand-read]
rw=randread
bs=4K
stonewall
direct=1

[seq-write]
rw=write
bs=4M
stonewall
direct=1

[rand-write]
rw=randwrite
bs=4K
stonewall
direct=1

Results with iodepth 4 and a 1G file:

# fio rbd-bench.fio
seq-read: (g=0): rw=read, bs=4M-4M/4M-4M, ioengine=libaio, iodepth=4
rand-read: (g=1): rw=randread, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=4
seq-write: (g=2): rw=write, bs=4M-4M/4M-4M, ioengine=libaio, iodepth=4
rand-write: (g=3): rw=randwrite, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=4
fio 1.59
Starting 4 processes
Jobs: 1 (f=1): [___w] [64.2% done] [0K/2588K /s] [0 /632 iops] [eta 01m:18s]
seq-read: (groupid=0, jobs=1): err= 0: pid=10586
  read : io=1024.0MB, bw=110656KB/s, iops=27, runt=9476msec
    slat (usec): min=250, max=1812, avg=389.88, stdev=178.26
    clat (msec): min=37, max=615, avg=147.42, stdev=102.77
     lat (msec): min=38, max=615, avg=147.81, stdev=102.77
    bw (KB/s): min=84216, max=122390, per=99.60%, avg=110208.50, stdev=9149.98
  cpu          : usr=0.00%, sys=0.97%, ctx=1552, majf=0, minf=4119
  IO depths    : 1=0.4%, 2=0.8%, 4=98.8%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued r/w/d: total=256/0/0, short=0/0/0
     lat (msec): 50=4.69%, 100=31.64%, 250=50.78%, 500=11.72%, 750=1.17%
rand-read: (groupid=1, jobs=1): err= 0: pid=10868
  read : io=161972KB, bw=2697.1KB/s, iops=674, runt=60036msec
    slat (usec): min=12, max=346, avg=39.89, stdev=10.04
    clat (usec): min=570, max=50215, avg=5885.64, stdev=12119.46
     lat (usec): min=601, max=50258, avg=5926.07, stdev=12117.44
    bw (KB/s): min=2015, max=3356, per=100.15%, avg=2701.03, stdev=276.41
  cpu          : usr=0.51%, sys=2.14%, ctx=66054, majf=0, minf=26
  IO depths    : 1=0.1%, 2=0.1%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued r/w/d: total=40493/0/0, short=0/0/0
     lat (usec): 750=3.69%, 1000=60.21%
     lat (msec): 2=19.37%, 4=1.49%, 10=1.30%, 20=0.30%, 50=13.64%
     lat (msec): 100=0.01%
seq-write: (groupid=2, jobs=1): err= 0: pid=12619
  write: io=1024.0MB, bw=112412KB/s, iops=27, runt=9328msec
    slat (usec): min=510, max=1683, avg=820.63, stdev=150.32
    clat (msec): min=47, max=744, avg=144.21, stdev=73.99
     lat (msec): min=48, max=744, avg=145.03, stdev=74.00
    bw (KB/s): min=103193, max=124830, per=100.87%, avg=113390.71, stdev=6178.93
  cpu          : usr=1.46%, sys=0.81%, ctx=267, majf=0, minf=21
  IO depths    : 1=0.4%, 2=0.8%, 4=98.8%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued r/w/d: total=0/256/0, short=0/0/0
     lat (msec): 50=0.78%, 100=17.97%, 250=75.39%, 500=5.08%, 750=0.78%
rand-write: (groupid=3, jobs=1): err= 0: pid=12934
  write: io=125352KB, bw=2088.1KB/s, iops=522, runt=60007msec
    slat (usec): min=13, max=388, avg=50.47, stdev=13.73
    clat (msec): min=1, max=1271, avg=7.60, stdev=22.16
     lat (msec): min=1, max=1271, avg=7.66, stdev=22.16
    bw (KB/s): min=155, max=2944, per=102.13%, avg=2132.45, stdev=563.22
  cpu          : usr=0.45%, sys=1.87%, ctx=51594, majf=0, minf=19
  IO depths    : 1=0.1%, 2=0.1%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued r/w/d: total=0/31338/0, short=0/0/0
     lat (msec): 2=5.84%, 4=59.28%, 10=12.47%, 20=15.83%, 50=5.72%
     lat (msec): 100=0.30%, 250=0.44%, 500=0.07%, 750=0.02%, 1000=0.02%
     lat (msec): 2000=0.01%

Run status group 0 (all jobs):
   READ: io=1024.0MB, aggrb=110655KB/s, minb=113311KB/s, maxb=113311KB/s, mint=9476msec, maxt=9476msec

Run status group 1 (all jobs):
   READ: io=161972KB, aggrb=2697KB/s, minb=2762KB/s, maxb=2762KB/s, mint=60036msec, maxt=60036msec

Run status group 2 (all jobs):
  WRITE: io=1024.0MB, aggrb=112411KB/s, minb=115109KB/s, maxb=115109KB/s, mint=9328msec, maxt=9328msec

Run status group 3 (all jobs):
  WRITE: io=125352KB, aggrb=2088KB/s, minb=2139KB/s, maxb=2139KB/s, mint=60007msec, maxt=60007msec

Disk stats (read/write):
  rbd1: ios=42707/33325, merge=0/0, ticks=439568/438892, in_queue=878724, util=99.57%

With an iodepth of 64 and a 10G file:

seq-read: (g=0): rw=read, bs=4M-4M/4M-4M, ioengine=libaio, iodepth=64
rand-read: (g=1): rw=randread, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=64
seq-write: (g=2): rw=write, bs=4M-4M/4M-4M, ioengine=libaio, iodepth=64
rand-write: (g=3): rw=randwrite, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=64
fio 1.59
Starting 4 processes
Jobs: 1 (f=1): [___w] [58.1% done] [0K/0K /s] [0 /0 iops] [eta 02m:57s]
seq-read: (groupid=0, jobs=1): err= 0: pid=25257
  read : io=6564.0MB, bw=110816KB/s, iops=27, runt=60655msec
    slat (usec): min=204, max=287661, avg=36605.14, stdev=63984.12
    clat (msec): min=573, max=5910, avg=2305.13, stdev=715.03
     lat (msec): min=712, max=5938, avg=2341.74, stdev=716.44
    bw (KB/s): min=0, max=116819, per=61.34%, avg=67975.54, stdev=54174.75
  cpu          : usr=0.00%, sys=1.08%, ctx=10644, majf=0, minf=65559
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.5%, 16=1.0%, 32=2.0%, >=64=96.2%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=99.9%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued r/w/d: total=1641/0/0, short=0/0/0
     lat (msec): 750=0.30%, 1000=0.37%, 2000=41.13%, >=2000=58.20%
rand-read: (groupid=1, jobs=1): err= 0: pid=27045
  read : io=6170.6MB, bw=105242KB/s, iops=26310, runt=60039msec
    slat (usec): min=9, max=2456, avg=26.68, stdev=9.23
    clat (usec): min=501, max=42630, avg=2403.11, stdev=1136.11
     lat (usec): min=544, max=42654, avg=2430.20, stdev=1135.87
    bw (KB/s): min=0, max=107376, per=65.94%, avg=69395.09, stdev=50034.68
  cpu          : usr=9.62%, sys=53.77%, ctx=1804080, majf=0, minf=86
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued r/w/d: total=1579662/0/0, short=0/0/0
     lat (usec): 750=0.34%, 1000=2.07%
     lat (msec): 2=38.96%, 4=51.17%, 10=7.37%, 20=0.08%, 50=0.01%
seq-write: (groupid=2, jobs=1): err= 0: pid=28845
  write: io=6776.0MB, bw=114538KB/s, iops=27, runt=60579msec
    slat (usec): min=419, max=237721, avg=35415.33, stdev=60635.70
    clat (msec): min=572, max=6468, avg=2229.49, stdev=935.01
     lat (msec): min=695, max=6469, avg=2264.91, stdev=931.13
    bw (KB/s): min=0, max=136533, per=61.73%, avg=70705.08, stdev=56037.47
  cpu          : usr=1.96%, sys=0.75%, ctx=623, majf=0, minf=21
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.5%, 16=0.9%, 32=1.9%, >=64=96.3%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=99.9%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued r/w/d: total=0/1694/0, short=0/0/0
     lat (msec): 750=0.30%, 1000=0.47%, 2000=63.64%, >=2000=35.60%
rand-write: (groupid=3, jobs=1): err= 0: pid=30722
  write: io=203724KB, bw=3250.5KB/s, iops=812, runt=62675msec
    slat (usec): min=12, max=589, avg=50.66, stdev=12.44
    clat (msec): min=1, max=3603, avg=78.65, stdev=242.01
     lat (msec): min=1, max=3603, avg=78.70, stdev=242.01
    bw (KB/s): min=0, max=7001, per=70.93%, avg=2305.36, stdev=2413.85
  cpu          : usr=0.59%, sys=2.66%, ctx=81900, majf=0, minf=19
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=99.9%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued r/w/d: total=0/50931/0, short=0/0/0
     lat (msec): 2=9.94%, 4=34.46%, 10=6.95%, 20=11.34%, 50=14.86%
     lat (msec): 100=7.69%, 250=7.06%, 500=3.90%, 750=1.59%, 1000=0.73%
     lat (msec): 2000=1.15%, >=2000=0.33%

Run status group 0 (all jobs):
   READ: io=6564.0MB, aggrb=110815KB/s, minb=113475KB/s, maxb=113475KB/s, mint=60655msec, maxt=60655msec

Run status group 1 (all jobs):
   READ: io=6170.6MB, aggrb=105242KB/s, minb=107768KB/s, maxb=107768KB/s, mint=60039msec, maxt=60039msec

Run status group 2 (all jobs):
  WRITE: io=6776.0MB, aggrb=114538KB/s, minb=117287KB/s, maxb=117287KB/s, mint=60579msec, maxt=60579msec

Run status group 3 (all jobs):
  WRITE: io=203724KB, aggrb=3250KB/s, minb=3328KB/s, maxb=3328KB/s, mint=62675msec, maxt=62675msec

Disk stats (read/write):
  rbd1: ios=1592951/64482, merge=0/0, ticks=12415028/12528984, in_queue=24945216, util=99.68%

Thank you in advance.

On Mon, Nov 19, 2012 at 7:11 PM, Mark Kampe <mark.kampe@xxxxxxxxxxx> wrote:
>
> On 11/19/2012 10:03 AM, Sébastien Han wrote:
>
>> The original benchmark has been performed with 4M block size. And as
>> you can see I still get more IOPS with rand than seq... I just tried
>> with 4M without direct I/O, still the same. I can print fio results if
>> it's needed.
>
> Yes, please send me your 4M random and sequential write results,
> both radosbench (or better, smalliobench, which is more directly
> comparable) and fio to an RBD.
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html