Hello again! Unfortunately I have to raise this problem again. I have constantly hanging snapshots on several images. My Ceph version is now 0.94.5. The RBD CLI always gives me this:

root@slpeah001:[~]:# rbd snap create volumes/volume-26c89a0a-be4d-45d4-85a6-e0dc134941fd --snap test
2016-01-13 12:04:39.107166 7fb70e4c2880 -1 librbd::ImageWatcher: 0x427a710 no lock owners detected
2016-01-13 12:04:44.108783 7fb70e4c2880 -1 librbd::ImageWatcher: 0x427a710 no lock owners detected
2016-01-13 12:04:49.110321 7fb70e4c2880 -1 librbd::ImageWatcher: 0x427a710 no lock owners detected
2016-01-13 12:04:54.112373 7fb70e4c2880 -1 librbd::ImageWatcher: 0x427a710 no lock owners detected

I turned on "debug rbd = 20" and found these records only on one of the OSDs (on the same host as the RBD client):

2016-01-13 11:44:46.076780 7fb5f05d8700 0 -- 192.168.252.11:6804/407141 >> 192.168.252.11:6800/407122 pipe(0x392d2000 sd=257 :6804 s=2 pgs=17 cs=1 l=0 c=0x383b4160).fault with nothing to send, going to standby
2016-01-13 11:58:26.261460 7fb5efbce700 0 -- 192.168.252.11:6804/407141 >> 192.168.252.11:6802/407124 pipe(0x39e45000 sd=156 :6804 s=2 pgs=17 cs=1 l=0 c=0x386fbb20).fault with nothing to send, going to standby
2016-01-13 12:04:23.948931 7fb5fede2700 0 -- 192.168.254.11:6804/407141 submit_message watch-notify(notify_complete (2) cookie 44850800 notify 99720550678667 ret -110) v3 remote, 192.168.254.11:0/1468572, failed lossy con, dropping message 0x3ab76fc0
2016-01-13 12:09:04.254329 7fb5fede2700 0 -- 192.168.254.11:6804/407141 submit_message watch-notify(notify_complete (2) cookie 69846112 notify 99720550678721 ret -110) v3 remote, 192.168.254.11:0/1509673, failed lossy con, dropping message 0x3830cb40

Here are the image properties:

root@slpeah001:[~]:# rbd info volumes/volume-26c89a0a-be4d-45d4-85a6-e0dc134941fd
rbd image 'volume-26c89a0a-be4d-45d4-85a6-e0dc134941fd':
        size 200 GB in 51200 objects
        order 22 (4096 kB objects)
        block_name_prefix: rbd_data.2f2a81562fea59
        format: 2
        features: layering, striping, exclusive, object map
        flags:
        stripe unit: 4096 kB
        stripe count: 1
root@slpeah001:[~]:# rbd status volumes/volume-26c89a0a-be4d-45d4-85a6-e0dc134941fd
Watchers:
        watcher=192.168.254.17:0/2088291 client.3424561 cookie=93888518795008
root@slpeah001:[~]:# rbd lock list volumes/volume-26c89a0a-be4d-45d4-85a6-e0dc134941fd
There is 1 exclusive lock on this image.
Locker          ID                      Address
client.3424561  auto 93888518795008     192.168.254.17:0/2088291

Taking RBD snapshots from the Python API hangs as well. This image is being used by libvirt.
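For reference, a minimal Python reproduction via the standard rados/rbd bindings looks roughly like this (same pool and image as above; the snapshot name is just an example) and hangs in create_snap() the same way the CLI does:

    import rados
    import rbd

    # Connect to the cluster and open the pool that holds the image.
    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()
    ioctx = cluster.open_ioctx('volumes')

    image = rbd.Image(ioctx, 'volume-26c89a0a-be4d-45d4-85a6-e0dc134941fd')
    try:
        # Hangs here, just like "rbd snap create" on the CLI.
        image.create_snap('test')
    finally:
        image.close()
        ioctx.close()
        cluster.shutdown()

Both the CLI and the binding go through librbd, so it is not surprising that they hang the same way.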
Any suggestions? Thanks!

Regards,
Vasily.

2016-01-06 1:11 GMT+08:00 Мистер Сёма <angapov@xxxxxxxxx>:
> Well, I believe the problem is no longer valid.
> My code before was:
>
> virsh qemu-agent-command $INSTANCE '{"execute":"guest-fsfreeze-freeze"}'
> rbd snap create $RBD_ID --snap `date +%F-%T`
>
> and then snapshot creation was hanging forever. I inserted a 2-second sleep.
>
> My code after:
>
> virsh qemu-agent-command $INSTANCE '{"execute":"guest-fsfreeze-freeze"}'
> sleep 2
> rbd snap create $RBD_ID --snap `date +%F-%T`
>
> And now it works perfectly. Again, I have no idea how this solved the problem.
> Thanks :)
>
> 2016-01-06 0:49 GMT+08:00 Мистер Сёма <angapov@xxxxxxxxx>:
>> I am very sorry, but I am not able to increase log verbosity because
>> it's a production cluster with very limited space for logs. Sounds
>> crazy, but that's it.
>> I have found out that the RBD snapshot process hangs forever only when
>> a QEMU fsfreeze was issued just before the snapshot. If the guest is
>> not frozen, the snapshot is taken with no problem... I have absolutely
>> no idea how these two things could be related to each other... And
>> again, this issue occurs only when there is an exclusive lock on the
>> image and the exclusive-lock feature is enabled on it.
>>
>> Does anybody else have this problem?
>>
>> 2016-01-05 2:55 GMT+08:00 Jason Dillaman <dillaman@xxxxxxxxxx>:
>>> I am surprised by the error you are seeing with exclusive lock
>>> enabled. The rbd CLI should be able to send the 'snap create' request
>>> to QEMU without an error. Are you able to provide "debug rbd = 20"
>>> logs from shortly before and after your snapshot attempt?
>>>
>>> --
>>>
>>> Jason Dillaman
>>>
>>> ----- Original Message -----
>>>> From: "Мистер Сёма" <angapov@xxxxxxxxx>
>>>> To: "ceph-users" <ceph-users@xxxxxxxxxxxxxx>
>>>> Sent: Monday, January 4, 2016 12:37:07 PM
>>>> Subject: How to do quiesced rbd snapshot in libvirt?
>>>>
>>>> Hello,
>>>>
>>>> Can anyone please tell me the right way to do quiesced RBD snapshots
>>>> in libvirt (OpenStack)?
>>>> My Ceph version is 0.94.3.
>>>>
>>>> I found two possible ways, and neither of them works for me. I wonder
>>>> if I'm doing something wrong:
>>>>
>>>> 1) Do a VM fsFreeze through the QEMU guest agent, perform the RBD
>>>> snapshot, then do fsThaw. This looks good, but the bad thing here is
>>>> that libvirt takes an exclusive lock on the image, which results in
>>>> errors like this when taking a snapshot: "7f359d304880 -1
>>>> librbd::ImageWatcher: no lock owners detected". It seems like the rbd
>>>> client is trying to take the snapshot on behalf of the exclusive lock
>>>> owner but is unable to find that owner. Without an exclusive lock
>>>> everything works fine.
>>>>
>>>> 2) Perform QEMU external snapshots with a local QCOW2 file overlaid
>>>> on top of the RBD image. This seems really interesting, but the bad
>>>> thing is that there is currently no way to remove this kind of
>>>> snapshot, because active blockcommit does not currently work for RBD
>>>> images (https://bugzilla.redhat.com/show_bug.cgi?id=1189998).
>>>>
>>>> So again my question is: how do you guys take quiesced RBD snapshots
>>>> in libvirt?
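For reference, option 1 (freeze via the guest agent, take the RBD snapshot, then thaw) can be sketched with the libvirt and rbd Python bindings. This is only a sketch: the domain name is made up, fsFreeze()/fsThaw() require libvirt >= 1.2.5 and a running qemu-guest-agent inside the guest, and older libvirt can issue the same guest-agent commands via virsh qemu-agent-command, as in the script quoted earlier in this thread.

    import libvirt
    import rados
    import rbd

    conn = libvirt.open('qemu:///system')
    dom = conn.lookupByName('instance-00000001')   # made-up domain name

    # Quiesce the guest filesystems via the QEMU guest agent.
    dom.fsFreeze()
    try:
        cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
        cluster.connect()
        ioctx = cluster.open_ioctx('volumes')
        image = rbd.Image(ioctx, 'volume-26c89a0a-be4d-45d4-85a6-e0dc134941fd')
        try:
            image.create_snap('quiesced-snap')   # example snapshot name
        finally:
            image.close()
            ioctx.close()
            cluster.shutdown()
    finally:
        # Always thaw, even if the snapshot fails; otherwise the guest stays frozen.
        dom.fsThaw()

    conn.close()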