On Wed, Jan 9, 2019 at 5:17 PM Kenneth Van Alstyne
<kvanalstyne@xxxxxxxxxxxxxxx> wrote:
>
> Hey folks, I’m looking into what I would think would be a simple
> problem, but it is turning out to be more complicated than I would have
> anticipated. A virtual machine managed by OpenNebula was blown away,
> but the backing RBD images remain. Upon investigating, it appears
> that the images still have watchers on the KVM node that that VM
> previously lived on. I can confirm that there are no mapped RBD images
> on the machine and the qemu-system-x86_64 process is indeed no longer
> running. Any ideas? Additional details are below:
>
> # rbd info one-73-145-10
> rbd image 'one-73-145-10':
>         size 1024 GB in 262144 objects
>         order 22 (4096 kB objects)
>         block_name_prefix: rbd_data.27174d6b8b4567
>         format: 2
>         features: layering, exclusive-lock, object-map, fast-diff, deep-flatten
>         flags:
>         parent: rbd/one-73@snap
>         overlap: 102400 kB
> #
> # rbd status one-73-145-10
> Watchers:
>         watcher=10.0.235.135:0/3820784110 client.33810559 cookie=140234310778880
> #
> #
> # rados -p rbd listwatchers rbd_header.27174d6b8b4567
> watcher=10.0.235.135:0/3820784110 client.33810559 cookie=140234310778880

This appears to be a RADOS (i.e. not a kernel client) watch.  Are you
sure that nothing of the sort is running on that node?

In order for the watch to stay live, the watcher has to send periodic
ping messages to the OSD.  Perhaps determine the primary OSD with
"ceph osd map rbd rbd_header.27174d6b8b4567", set debug_ms to 1 on that
OSD and monitor the log for a few minutes?

Thanks,

                Ilya
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
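
[For readers following the thread later: Ilya's diagnostic procedure can
be sketched as the shell session below. The object name comes from the
block_name_prefix shown in "rbd info" above; "osd.N" stands for whatever
the map command reports as the primary OSD, and the log path is the
default location, which may differ on your system.]

```shell
# Find the primary OSD serving the image's header object.
ceph osd map rbd rbd_header.27174d6b8b4567

# Raise message-level debugging on that primary OSD
# (substitute the actual OSD id for N).
ceph tell osd.N injectargs '--debug_ms 1'

# Monitor the OSD log for a few minutes for ping/watch traffic
# from the stale watcher's address (10.0.235.135 in this thread).
tail -f /var/log/ceph/ceph-osd.N.log | grep 10.0.235.135

# Restore the debug level when done.
ceph tell osd.N injectargs '--debug_ms 0'
```

[If the watcher really is gone, no pings from that address should show
up in the log; if pings do appear, some process on that node is still
holding the watch.]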