On Wed, Jan 9, 2019 at 5:17 PM Kenneth Van Alstyne
<kvanalstyne@xxxxxxxxxxxxxxx> wrote:
>
> Hey folks, I’m looking into what I would think would be a simple
> problem, but it is turning out to be more complicated than I would have
> anticipated. A virtual machine managed by OpenNebula was blown away,
> but the backing RBD images remain. Upon investigating, it appears
> that the images still have watchers on the KVM node that that VM
> previously lived on. I can confirm that there are no mapped RBD images
> on the machine and the qemu-system-x86_64 process is indeed no longer
> running. Any ideas? Additional details are below:
>
> # rbd info one-73-145-10
> rbd image 'one-73-145-10':
>         size 1024 GB in 262144 objects
>         order 22 (4096 kB objects)
>         block_name_prefix: rbd_data.27174d6b8b4567
>         format: 2
>         features: layering, exclusive-lock, object-map, fast-diff, deep-flatten
>         flags:
>         parent: rbd/one-73@snap
>         overlap: 102400 kB
> #
> # rbd status one-73-145-10
> Watchers:
>         watcher=10.0.235.135:0/3820784110 client.33810559 cookie=140234310778880
> #
> #
> # rados -p rbd listwatchers rbd_header.27174d6b8b4567
> watcher=10.0.235.135:0/3820784110 client.33810559 cookie=140234310778880

This appears to be a RADOS (i.e. not a kernel client) watch.  Are you
sure that nothing of the sort is running on that node?

In order for the watch to stay live, the watcher has to send periodic
ping messages to the OSD.  Perhaps determine the primary OSD with
"ceph osd map rbd rbd_header.27174d6b8b4567", set debug_ms to 1 on that
OSD and monitor the log for a few minutes?

Thanks,

                Ilya
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
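
[For readers following the thread later: Ilya's diagnostic procedure can
be sketched as the shell session below. The object name comes from the
block_name_prefix shown in "rbd info" above; "osd.N" stands for whatever
the map command reports as the primary OSD, and the log path is the
default location, which may differ on your system.]

```shell
# Find the primary OSD serving the image's header object.
ceph osd map rbd rbd_header.27174d6b8b4567

# Raise message-level debugging on that primary OSD
# (substitute the actual OSD id for N).
ceph tell osd.N injectargs '--debug_ms 1'

# Monitor the OSD log for a few minutes for ping/watch traffic
# from the stale watcher's address (10.0.235.135 in this thread).
tail -f /var/log/ceph/ceph-osd.N.log | grep 10.0.235.135

# Restore the debug level when done.
ceph tell osd.N injectargs '--debug_ms 0'
```

[If the watcher really is gone, no pings from that address should show
up in the log; if pings do appear, some process on that node is still
holding the watch.]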