On 06/02/17 11:59, Peter Maloney wrote:
> On 06/01/17 17:12, koukou73gr wrote:
>> Hello list,
>>
>> Today I had to create a new image for a VM. This was the first time
>> since our cluster was updated from Hammer to Jewel. So far I had just
>> been copying an existing golden image and resizing it as appropriate,
>> but this time I used rbd create.
>>
>> So I "rbd create"d a 2T image and attached it to an existing VM guest
>> with librbd using:
>>
>> <disk type='network' device='disk'>
>>   <driver name='qemu'/>
>>   <auth username='lalala'>
>>     <secret type='ceph' uuid='uiduiduid'/>
>>   </auth>
>>   <source protocol='rbd' name='libvirt-pool/srv-10-206-123-87.mails'/>
>>   <target dev='sdc' bus='scsi'/>
>>   <address type='drive' controller='0' bus='0' target='1' unit='0'/>
>> </disk>
>>
>> I booted the guest and tried to partition the new drive from inside the
>> guest. That was it: parted (and anything else, for that matter) that
>> tried to access the new disk would freeze. After 2 minutes the kernel
>> would start complaining:
>>
>> [ 360.212391] INFO: task parted:1836 blocked for more than 120 seconds.
>> [ 360.216001] Not tainted 4.4.0-78-generic #99-Ubuntu
>> [ 360.218663] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>> disables this message.
>
> Is it easy for you to reproduce it? I had the same problem, and the same
> solution. But it isn't easy to reproduce... Jason Dillaman asked me for
> a gcore dump of a hung process, but I wasn't able to get one. Can you do
> that, and when you reply, CC Jason Dillaman <jdillama@xxxxxxxxxx>?

I mean a hung qemu process on the VM host (the one that uses librbd).
And I guess that should be TO rather than CC.

>> After much headbanging and trial and error, I finally thought of
>> checking the enabled rbd features of an existing image versus the new
>> one:
>>
>> pre-existing: layering, striping
>> new: layering, exclusive-lock, object-map, fast-diff, deep-flatten
>>
>> Disabling exclusive-lock (and fast-diff and object-map before that)
>> finally allowed the new image to become usable in the guest.
>>
>> This is with:
>>
>> ceph version 10.2.5 (c461ee19ecbc0c5c330aca20f7392c9a00730367)
>> qemu-img version 2.6.0 (qemu-kvm-ev-2.6.0-28.el7_3.3.1), Copyright (c)
>> 2004-2008 Fabrice Bellard
>>
>> on a host running:
>>
>> CentOS Linux release 7.3.1611 (Core)
>> Linux host-10-206-123-184.physics.auth.gr 3.10.0-327.36.2.el7.x86_64 #1
>> SMP Mon Oct 10 23:08:37 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
>>
>> and a guest running:
>>
>> DISTRIB_ID=Ubuntu
>> DISTRIB_RELEASE=16.04
>> DISTRIB_CODENAME=xenial
>> DISTRIB_DESCRIPTION="Ubuntu 16.04.2 LTS"
>> Linux srv-10-206-123-87.physics.auth.gr 4.4.0-78-generic #99-Ubuntu SMP
>> Thu Apr 27 15:29:09 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
>>
>> I vaguely remember references to problems when exclusive-lock was
>> enabled on rbd images, but Google didn't reveal much to me.
>>
>> So what is it with exclusive-lock? Why does it fail like this? Could
>> you please point me to some documentation on this behaviour?
>>
>> Thanks for any feedback.
>>
>> -K.
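For anyone who lands on the same problem: the feature set is easy to
inspect and trim with the rbd CLI. A minimal sketch of the workflow
described above, reusing the image name from the XML (the dependencies
force the order, since fast-diff needs object-map and object-map needs
exclusive-lock):

  # show which features the image currently has enabled
  rbd info libvirt-pool/srv-10-206-123-87.mails

  # disable the dependent features first, then exclusive-lock itself
  rbd feature disable libvirt-pool/srv-10-206-123-87.mails fast-diff
  rbd feature disable libvirt-pool/srv-10-206-123-87.mails object-map
  rbd feature disable libvirt-pool/srv-10-206-123-87.mails exclusive-lock

  # or create new images with only the pre-Jewel feature set to begin with
  # ("new-image" is just a placeholder name)
  rbd create --size 2T --image-feature layering libvirt-pool/new-image

Setting "rbd default features = 3" (layering + striping) in the client
section of ceph.conf should also make that the default for all newly
created images.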
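As for the gcore dump: something along these lines, run on the VM host
against the hung qemu process, should capture it without killing the
guest (the pid and output path are only placeholders):

  # find the qemu process backing the guest (the binary name varies by distro)
  pgrep -a qemu

  # dump a core of the live process with gdb's gcore; the process keeps running
  gcore -o /tmp/qemu-hung <pid>

gcore ships with gdb, and the dump can be roughly as large as the
guest's RAM, so check the free space under the output path first.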
--
--------------------------------------------
Peter Maloney
Brockmann Consult
Max-Planck-Str. 2
21502 Geesthacht
Germany
Tel: +49 4152 889 300
Fax: +49 4152 889 333
E-mail: peter.maloney@xxxxxxxxxxxxxxxxxxxx
Internet: http://www.brockmann-consult.de
--------------------------------------------

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com