Re: bad perf for librbd vs krbd using FIO

Hi Christian,

Will try direct=1 and a 4KB block size, cheers.

Re: librbd, I'm not using a VM yet, only FIO with the RBD ioengine (ioengine=rbd in the fio job file). Actually, I was seeing some unexpected IO on a VM, which is what prompted me to start doing tests on the hypervisor host.

Raf

On 11 September 2015 at 13:42, Christian Balzer <chibi@xxxxxxx> wrote:

Hello,

On Fri, 11 Sep 2015 13:24:24 +1000 Rafael Lopez wrote:

> Hi all,
>
> I am seeing a big discrepancy between librbd and kRBD/ext4 performance
> using FIO with a single RBD image. The RBD images come from the same RBD
> pool, with the same size and settings for both. The librbd results are
> quite bad by comparison. In addition, if I scale up the kRBD FIO job with
> more jobs/threads, throughput increases to 3-4x the results below, but
> librbd doesn't seem to scale much at all. I figured that librbd should be
> close to the kRBD result for a single job/thread, before parallelism
> comes into play. RBD cache settings are all default.
>
librbd as in FUSE, or as in a KVM client VM?

RBD cache settings only influence librbd; the kernel will use all of the
available memory for page cache.

And this is what you're probably seeing, with the kernel RBD being so much
faster.
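
As a rough sanity check on that explanation, the figures from the two fio
runs quoted below can be put through Little's law (IOPS is roughly the
number of I/Os in flight divided by the average latency). This is only a
sketch: the latencies, IOPS and bandwidths are copied from the quoted
output, the depth of 1 for the kRBD run is read off its
"IO depths: 1=100.0%" line (that job ran with ioengine=sync), full use of
iodepth=16 on the librbd side is an assumption, and the function name is
made up for illustration.

```python
def expected_iops(depth, avg_lat_ms):
    """Little's law: throughput = I/Os in flight / average latency."""
    return depth / (avg_lat_ms / 1000.0)


# kRBD/ext4 run: ioengine=sync, so one I/O in flight; lat avg=2520.12 usec.
krbd_iops = expected_iops(1, 2.52012)

# librbd run: iodepth=16 assumed fully used; lat avg=130.50 msec.
librbd_iops = expected_iops(16, 130.50)

# fio itself reported iops=390 and iops=119 for these two runs.
print(f"kRBD   predicted {krbd_iops:.0f} IOPS, fio reported 390")
print(f"librbd predicted {librbd_iops:.0f} IOPS, fio reported 119")

# Bandwidth ratio between the two runs (KB/s figures from the fio output).
print(f"kRBD/librbd bandwidth ratio: {399741 / 121882:.1f}x")
```

Both predictions land within a few percent of what fio reported. The kRBD
job was effectively one buffered 1MB write at a time completing in about
2.5 ms, which is consistent with writes landing in page cache, while the
librbd job was waiting about 130 ms per write with up to 16 in flight.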

Anyway, for a good comparison and an idea of what your cluster can do,
first test with a block size of 4KB (and a smaller total size, of course)
and direct=1.
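
One way to keep such a comparison like-for-like is to generate both job
files from a single shared set of options, so the two runs differ only in
the engine-specific keys. The sketch below does that in Python. Note that
size=1g, ioengine=libaio for the kRBD side, rw=write (in place of rw=rw
with rwmixwrite=100) and the helper name render_job are assumptions for
illustration and not from this thread; the file name, image name and pool
are the ones from the job files quoted below.

```python
import configparser
import io

# Options shared by both runs: 4KB sequential writes, page cache bypassed.
COMMON = {
    "rw": "write",       # 100% sequential writes, as in the original jobs
    "bs": "4k",
    "direct": "1",
    "size": "1g",        # assumption: "smaller total size", value is arbitrary
    "iodepth": "16",
    "numjobs": "1",
}


def render_job(engine_opts):
    """Return fio job-file text: COMMON plus the engine-specific options."""
    cp = configparser.ConfigParser()
    cp["global"] = {**COMMON, **engine_opts}
    cp["job1"] = {}
    buf = io.StringIO()
    # fio job files use key=value, so write without spaces around '='.
    cp.write(buf, space_around_delimiters=False)
    return buf.getvalue()


krbd_job = render_job({
    "ioengine": "libaio",  # assumption: an async engine so iodepth applies
    "filename": "/mnt/rbd/fio_test_file_ext4",
})
librbd_job = render_job({
    "ioengine": "rbd",
    "rbdname": "nas1-rds-stg31",
    "pool": "rbd",
})

print(krbd_job)
print(librbd_job)
```

Each string can be saved to its own file and passed to fio in the same way
as the job files quoted below.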

Christian

> I can see some obvious differences in FIO output, but not being well
> versed with FIO I'm not sure what to make of it or where to start
> diagnosing the discrepancy. Hunted around but haven't found anything
> useful, any suggestions/insights would be appreciated.
>
> RBD cache settings:
> [root@rcmktdc1r72-09-ac rafaell]# ceph --admin-daemon /var/run/ceph/ceph-osd.659.asok config show | grep rbd_cache
>     "rbd_cache": "true",
>     "rbd_cache_writethrough_until_flush": "true",
>     "rbd_cache_size": "33554432",
>     "rbd_cache_max_dirty": "25165824",
>     "rbd_cache_target_dirty": "16777216",
>     "rbd_cache_max_dirty_age": "1",
>     "rbd_cache_max_dirty_object": "0",
>     "rbd_cache_block_writes_upfront": "false",
> [root@rcmktdc1r72-09-ac rafaell]#
>
> This is the FIO job file for the kRBD job:
>
> [root@rcprsdc1r72-01-ac rafaell]# cat ext4_test
> ; -- start job file --
> [global]
> rw=rw
> size=100g
> filename=/mnt/rbd/fio_test_file_ext4
> rwmixread=0
> rwmixwrite=100
> percentage_random=0
> bs=1024k
> direct=0
> iodepth=16
> thread=1
> numjobs=1
> [job1]
> ; -- end job file --
>
> [root@rcprsdc1r72-01-ac rafaell]#
>
> This is the FIO job file for the librbd job:
>
> [root@rcprsdc1r72-01-ac rafaell]# cat fio_rbd_test
> ; -- start job file --
> [global]
> rw=rw
> size=100g
> rwmixread=0
> rwmixwrite=100
> percentage_random=0
> bs=1024k
> direct=0
> iodepth=16
> thread=1
> numjobs=1
> ioengine=rbd
> rbdname=nas1-rds-stg31
> pool=rbd
> [job1]
> ; -- end job file --
>
>
> Here are the results:
>
> [root@rcprsdc1r72-01-ac rafaell]# fio ext4_test
> job1: (g=0): rw=rw, bs=1M-1M/1M-1M/1M-1M, ioengine=sync, iodepth=16
> fio-2.2.8
> Starting 1 thread
> job1: Laying out IO file(s) (1 file(s) / 102400MB)
> Jobs: 1 (f=1): [W(1)] [100.0% done] [0KB/321.7MB/0KB /s] [0/321/0 iops] [eta 00m:00s]
> job1: (groupid=0, jobs=1): err= 0: pid=37981: Fri Sep 11 12:33:13 2015
>   write: io=102400MB, bw=399741KB/s, iops=390, runt=262314msec
>     clat (usec): min=411, max=574082, avg=2492.91, stdev=7316.96
>      lat (usec): min=418, max=574113, avg=2520.12, stdev=7318.53
>     clat percentiles (usec):
>      |  1.00th=[  446],  5.00th=[  458], 10.00th=[  474], 20.00th=[  510],
>      | 30.00th=[ 1064], 40.00th=[ 1096], 50.00th=[ 1160], 60.00th=[ 1320],
>      | 70.00th=[ 1592], 80.00th=[ 2448], 90.00th=[ 7712], 95.00th=[ 7904],
>      | 99.00th=[11072], 99.50th=[11712], 99.90th=[13120], 99.95th=[73216],
>      | 99.99th=[464896]
>     bw (KB  /s): min=  264, max=2156544, per=100.00%, avg=412986.27, stdev=375092.66
>     lat (usec) : 500=18.68%, 750=7.43%, 1000=2.11%
>     lat (msec) : 2=48.89%, 4=4.35%, 10=16.79%, 20=1.67%, 50=0.03%
>     lat (msec) : 100=0.03%, 250=0.02%, 500=0.01%, 750=0.01%
>   cpu          : usr=1.24%, sys=45.38%, ctx=19298, majf=0, minf=974
>   IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
>      submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      issued    : total=r=0/w=102400/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
>      latency   : target=0, window=0, percentile=100.00%, depth=16
>
> Run status group 0 (all jobs):
>   WRITE: io=102400MB, aggrb=399740KB/s, minb=399740KB/s, maxb=399740KB/s, mint=262314msec, maxt=262314msec
>
> Disk stats (read/write):
>   rbd0: ios=0/150890, merge=0/49, ticks=0/36117700, in_queue=36145277, util=96.97%
> [root@rcprsdc1r72-01-ac rafaell]#
>
> [root@rcprsdc1r72-01-ac rafaell]# fio fio_rbd_test
> job1: (g=0): rw=rw, bs=1M-1M/1M-1M/1M-1M, ioengine=rbd, iodepth=16
> fio-2.2.8
> Starting 1 thread
> rbd engine: RBD version: 0.1.9
> Jobs: 1 (f=1): [W(1)] [100.0% done] [0KB/65405KB/0KB /s] [0/63/0 iops] [eta 00m:00s]
> job1: (groupid=0, jobs=1): err= 0: pid=43960: Fri Sep 11 12:54:25 2015
>   write: io=102400MB, bw=121882KB/s, iops=119, runt=860318msec
>     slat (usec): min=355, max=7300, avg=908.97, stdev=361.02
>     clat (msec): min=11, max=1468, avg=129.59, stdev=130.68
>      lat (msec): min=12, max=1468, avg=130.50, stdev=130.69
>     clat percentiles (msec):
>      |  1.00th=[   21],  5.00th=[   26], 10.00th=[   29], 20.00th=[   34],
>      | 30.00th=[   37], 40.00th=[   40], 50.00th=[   44], 60.00th=[   63],
>      | 70.00th=[  233], 80.00th=[  241], 90.00th=[  269], 95.00th=[  367],
>      | 99.00th=[  553], 99.50th=[  652], 99.90th=[  832], 99.95th=[  848],
>      | 99.99th=[ 1369]
>     bw (KB  /s): min=20363, max=248543, per=100.00%, avg=124381.19, stdev=42313.29
>     lat (msec) : 20=0.95%, 50=55.27%, 100=5.55%, 250=24.83%, 500=12.28%
>     lat (msec) : 750=0.89%, 1000=0.21%, 2000=0.01%
>   cpu          : usr=9.58%, sys=1.15%, ctx=23883, majf=0, minf=2751023
>   IO depths    : 1=1.2%, 2=3.0%, 4=9.7%, 8=68.3%, 16=17.8%, 32=0.0%, >=64=0.0%
>      submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      complete  : 0=0.0%, 4=92.5%, 8=4.3%, 16=3.2%, 32=0.0%, 64=0.0%, >=64=0.0%
>      issued    : total=r=0/w=102400/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
>      latency   : target=0, window=0, percentile=100.00%, depth=16
>
> Run status group 0 (all jobs):
>   WRITE: io=102400MB, aggrb=121882KB/s, minb=121882KB/s, maxb=121882KB/s, mint=860318msec, maxt=860318msec
>
> Disk stats (read/write):
>     dm-1: ios=0/2072, merge=0/0, ticks=0/233, in_queue=233, util=0.01%, aggrios=1/2249, aggrmerge=7/559, aggrticks=9/254, aggrin_queue=261, aggrutil=0.01%
>   sda: ios=1/2249, merge=7/559, ticks=9/254, in_queue=261, util=0.01%
> [root@rcprsdc1r72-01-ac rafaell]#
>
> Cheers,
> Raf
>
>


--
Christian Balzer        Network/Systems Engineer
chibi@xxxxxxx           Global OnLine Japan/Fusion Communications
http://www.gol.com/



--
Rafael Lopez
Data Storage Administrator
Servers & Storage (eSolutions)
+61 3 990 59118

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
