Re: RBD exclusive-lock and lqemu/librbd

koukou73gr <koukou73gr@xxxxxxxxx> · Fri, 2 Jun 2017 13:25:06 +0300

On 2017-06-02 13:01, Peter Maloney wrote:
>> Is it easy for you to reproduce it? I had the same problem, and the same
>> solution. But it isn't easy to reproduce... Jason Dillaman asked me for
>> a gcore dump of a hung process but I wasn't able to get one. Can you do
>> that, and when you reply, CC  Jason Dillaman <jdillama@xxxxxxxxxx> ?
> I mean a hung qemu process on the vm host (the one that uses librbd).
> And I guess that should be TO rather than CC.
>

Peter,

Can it be that my situation is different?

In my case the guest/qemu process itself does not hang. The guest root
filesystem resides in an rbd image w/o exclusive-lock enabled (the
pre-existing kind I described).

The problem surfaced when additional storage was attached to the guest,
through a new rbd image created with exclusive-lock enabled, as that is
the default on Jewel.
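
A minimal sketch of how one could check whether a given image carries the
feature, assuming the rbd CLI is available on a client node.
has_exclusive_lock is a hypothetical helper, rbd/newdisk is a placeholder
pool/image name, and the match assumes that `rbd info` prints a
"features:" line listing the enabled features:

```shell
# Hypothetical helper: reads `rbd info` output on stdin and succeeds
# if the features line mentions exclusive-lock.
has_exclusive_lock() {
    grep -Eq '^[[:space:]]*features:.*exclusive-lock'
}

# Usage (placeholder pool/image name, not taken from this thread):
#   rbd info rbd/newdisk | has_exclusive_lock && echo "exclusive-lock enabled"
```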

The problem is that when parted/fdisk is run on that device, it hangs as
reported. On the other hand,

dd if=/dev/sdb of=/tmp/lala count=512

completes without a problem, while the reverse,

dd if=/tmp/lala of=/dev/sdb count=512

hangs indefinitely. While in this state, I can still ssh to the guest
and work as long as I don't touch the new device. It appears that once a
write to the device backed by the exclusive-lock enabled image hangs, a
read from it will hang as well.
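
A rough sketch of how the read/write comparison above could be scripted
inside the guest so that a hung command does not block the terminal.
probe_io is a hypothetical helper and the 10 second limit is arbitrary. It
only polls a background job, so a process stuck in uninterruptible I/O is
reported as a hang but may not actually go away when killed:

```shell
# Hypothetical helper: run a command in the background and report
# "ok" if it exits within LIMIT seconds, "hang" otherwise.
probe_io() {
    limit=$1; shift
    "$@" >/dev/null 2>&1 &
    pid=$!
    elapsed=0
    # kill -0 only tests whether the process still exists.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$elapsed" -ge "$limit" ]; then
            # Best effort only: a task blocked in I/O may ignore this.
            kill "$pid" 2>/dev/null
            echo "hang"
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    echo "ok"
}

# Usage with the commands from this post (the second one overwrites
# the start of /dev/sdb, so only run it on a scratch device):
#   probe_io 10 dd if=/dev/sdb of=/tmp/lala count=512
#   probe_io 10 dd if=/tmp/lala of=/dev/sdb count=512
```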

-K.
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com