We have been observing similar behavior. It typically happens when we create a new rbd image, expose it to the guest, and run any operation that issues discards to the device.
A typical first command run on a given device is mkfs, usually with discard enabled:
# time mkfs.xfs -s size=4096 -f /dev/sda
meta-data="" isize=256 agcount=4, agsize=6553600 blks
= sectsz=4096 attr=2, projid32bit=0
data = bsize=4096 blocks=26214400, imaxpct=25
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0
log =internal log bsize=4096 blocks=12800, version=2
= sectsz=4096 sunit=1 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
real 9m10.882s
user 0m0.000s
sys 0m0.012s
When we issue the same command with the object-map feature disabled on the image, it completes much faster:
# time mkfs.xfs -s size=4096 -f /dev/sda
meta-data="" isize=256 agcount=4, agsize=6553600 blks
= sectsz=4096 attr=2, projid32bit=0
data = bsize=4096 blocks=26214400, imaxpct=25
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0
log =internal log bsize=4096 blocks=12800, version=2
= sectsz=4096 sunit=1 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
real 0m1.780s
user 0m0.000s
sys 0m0.012s
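For reference, the object-map toggle between the two runs above is just the plain rbd CLI, roughly as follows (a sketch; "rbd/disk-test" is a placeholder image spec, fast-diff has to be disabled together with object-map since it depends on it, and the object map needs a rebuild after re-enabling):

# rbd feature disable rbd/disk-test fast-diff object-map
# rbd feature enable rbd/disk-test object-map fast-diff
# rbd object-map rebuild rbd/disk-test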
Also, from what I am seeing, the slowness seems to be proportional to the size of the image rather than to the amount of data written to it. Issuing mkfs without discard does not reproduce the issue. The values above are for a 100G rbd image; a 250G image takes slightly more than twice as long:
# time mkfs.xfs -s size=4096 -f /dev/sda
meta-data="" isize=256 agcount=4, agsize=16384000 blks
= sectsz=4096 attr=2, projid32bit=0
data = bsize=4096 blocks=65536000, imaxpct=25
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0
log =internal log bsize=4096 blocks=32000, version=2
= sectsz=4096 sunit=1 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
real 22m58.076s
user 0m0.000s
sys 0m0.024s
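If it helps to reproduce this independently of mkfs, discarding the whole device from inside the guest with blkdiscard (util-linux) should exercise the same path; a rough sketch of the comparison is to time it on a freshly created image once with object-map enabled and once with it disabled (the device name is a placeholder):

# time blkdiscard /dev/sda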
Let me know if you need any more information regarding this. We would like to enable object-map (and fast-diff) on our images once this gets resolved.
On Wed, Jun 22, 2016 at 5:39 PM, Jason Dillaman <jdillama@xxxxxxxxxx> wrote:
I'm not sure why I never received the original list email, so I
apologize for the delay. Is /dev/sda1, from your example, fresh with
no data to discard, or does it actually have lots of data to
discard?
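A quick way to check from the host would be something like the following (image spec is a guess based on your rbd info output; adjust the pool name):

# rbd du rbd/disk-22920
# rbd diff rbd/disk-22920

Either one should show whether any objects / extents are actually allocated.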
Thanks,
Jason
On Wed, Jun 22, 2016 at 1:56 PM, Brian Andrus <bandrus@xxxxxxxxxx> wrote:
> I've created a downstream bug for this same issue.
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1349116
>
> On Wed, Jun 15, 2016 at 6:23 AM, <list@xxxxxxxxxxxxxxx> wrote:
>>
>> Hello guys,
>>
>> We are currently testing Ceph Jewel with object-map feature enabled:
>>
>> rbd image 'disk-22920':
>>         size 102400 MB in 25600 objects
>>         order 22 (4096 kB objects)
>>         block_name_prefix: rbd_data.7cfa2238e1f29
>>         format: 2
>>         features: layering, exclusive-lock, object-map, fast-diff, deep-flatten
>>         flags:
>>
>> We use this RBD as a disk for a KVM virtual machine with virtio-scsi and
>> discard=unmap. We noticed the following parameters in /sys/block:
>>
>> # cat /sys/block/sda/queue/discard_*
>> 4096          <- discard_granularity
>> 1073741824    <- discard_max_bytes
>> 0             <- discard_zeroes_data
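>>
>> For reference, on the QEMU side the disk ends up attached roughly like
>> this (a sketch with other options elided; the guest is actually defined
>> through libvirt, and the pool/monitor details are placeholders):
>>
>> qemu-system-x86_64 ... \
>>     -drive file=rbd:rbd/disk-22920,format=raw,if=none,id=drive0,discard=unmap \
>>     -device virtio-scsi-pci,id=scsi0 \
>>     -device scsi-hd,bus=scsi0.0,drive=drive0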
>>
>> While doing a mkfs.ext4 on the disk in the VM, we noticed low
>> performance when using discard.
>>
>> mkfs.ext4 -E nodiscard /dev/sda1 - took 5 seconds to complete
>> mkfs.ext4 -E discard /dev/sda1 - took around 3 minutes
>>
>> When disabling the object-map, the mkfs with discard took just 5 seconds.
>>
>> Do you have any idea what might cause this issue?
>>
>> Kernel: 4.2.0-35-generic #40~14.04.1-Ubuntu
>> Ceph: 10.2.0
>> Libvirt: 1.3.1
>> QEMU: 2.5.0
>>
>> Thanks!
>>
>> Best regards,
>> Jonas
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users@xxxxxxxxxxxxxx
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
>
>
> --
> Brian Andrus
> Red Hat, Inc.
> Storage Consultant, Global Storage Practice
> Mobile +1 (530) 903-8487
>
>