On Wed, Jul 18, 2018 at 10:55 AM Nikola Ciprich <nikola.ciprich@xxxxxxxxxxx> wrote:
Hi,
Historically I've found many discussions about this topic over the last few years, but it still seems somewhat unresolved, so I'd like to raise the question again.
In all-flash deployments, under Luminous 12.2.5 and qemu 12.2.0 using librbd, I'm getting much worse IOPS results than with krbd and direct block device access.
I'm testing on the same 100 GB RBD volume; notable Ceph settings:
client rbd cache disabled
osd_enable_op_tracker = False
osd_op_num_shards = 64
osd_op_num_threads_per_shard = 1
OSDs are running BlueStore, 2 replicas (it's just for testing)
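(For reference, expressed as ceph.conf the options above would look roughly like the sketch below; the section placement and the spelled-out rbd cache line are my reading of the description, not copied from the actual config:)

    [client]
    rbd cache = false

    [osd]
    osd enable op tracker = false
    osd op num shards = 64
    osd op num threads per shard = 1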
When I run fio using librbd directly, I'm getting ~160k reads/s and ~60k writes/s, which is not that bad.
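(A minimal fio job for the librbd case might look like the following; the pool/image names, block size, queue depth and job count are illustrative assumptions, not the exact parameters used here:)

    [global]
    ioengine=rbd
    clientname=admin
    pool=rbd
    rbdname=test-100g
    direct=1
    bs=4k
    iodepth=32
    numjobs=4
    group_reporting

    [rand-read]
    rw=randread
    runtime=60
    time_based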
However, when I run fio on a block device inside a VM (qemu using librbd), I'm getting only 60k/40k op/s, which is a huge loss.
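(For the qemu/librbd case the disk attachment would presumably look something like the libvirt snippet below; the monitor host, pool/image name and secret UUID are placeholders, and cache='none' matches the disabled rbd cache:)

    <disk type='network' device='disk'>
      <driver name='qemu' type='raw' cache='none' io='native'/>
      <source protocol='rbd' name='rbd/test-100g'>
        <host name='mon1' port='6789'/>
      </source>
      <auth username='admin'>
        <secret type='ceph' uuid='00000000-0000-0000-0000-000000000000'/>
      </auth>
      <target dev='vdb' bus='virtio'/>
    </disk>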
When I use a VM with block access to a krbd-mapped device, the numbers are much better: something like 115k/40k op/s, which is not ideal but still much better. I've tried many optimisations and configuration variants (multiple queues, threads vs. native aio, etc.), but krbd still performs much, much better.
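(The krbd variant I'd picture roughly as follows; the image name and resulting device path are placeholders:)

    # on the hypervisor
    rbd map rbd/test-100g        # e.g. maps to /dev/rbd0

    # then handed to the guest as a plain block device, e.g. in libvirt:
    <disk type='block' device='disk'>
      <driver name='qemu' type='raw' cache='none' io='native'/>
      <source dev='/dev/rbd0'/>
      <target dev='vdb' bus='virtio'/>
    </disk>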
My question is whether this is expected, or whether both access methods should give more similar results. If possible, I'd like to stick to librbd (especially because krbd still lacks layering support, but there are more reasons).
Just to clarify: modern / rebased krbd block drivers definitely support layering. The only missing features right now are object-map/fast-diff, deep-flatten, and journaling (for RBD mirroring).
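So, for example, an image created with only the krbd-friendly feature set (the name and size below are just illustrative) should map fine with a recent kernel:

    rbd create rbd/krbd-test --size 100G \
        --image-feature layering,exclusive-lock
    rbd map rbd/krbd-test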
Interestingly, when I compare fio with direct Ceph access, librbd performs better than krbd, but this doesn't concern me that much.
Another question: during the tests I noticed that enabling the exclusive-lock feature degrades write IOPS a lot as well. Is this expected? (Performance falls to something like 50%.)
If you are running multiple fio jobs against the same image (or have the krbd device mapped to multiple hosts w/ active IO), then I would expect a huge performance hit since the lock needs to be transitioned between clients.
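If you want to rule the feature itself out, you can toggle it on the test image (the image name below is a placeholder; object-map/fast-diff/journaling depend on exclusive-lock and would need to be disabled first):

    rbd feature disable rbd/test-100g exclusive-lock
    rbd info rbd/test-100g | grep features
    # re-enable afterwards if desired:
    rbd feature enable rbd/test-100g exclusive-lock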
I'm doing the tests on a small 2-node cluster; the VMs are running directly on the Ceph nodes. Everything is CentOS 7 with a 4.14 kernel. (I know it's not recommended to run VMs directly on Ceph nodes, but for small deployments it's necessary for us.)
If I can provide more details, I'll be happy to do so.
BR
nik
--
-------------------------------------
Ing. Nikola CIPRICH
LinuxBox.cz, s.r.o.
28.rijna 168, 709 00 Ostrava
tel.: +420 591 166 214
fax: +420 596 621 273
mobil: +420 777 093 799
www.linuxbox.cz
mobil servis: +420 737 238 656
email servis: servis@xxxxxxxxxxx
-------------------------------------
Jason
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com