Good morning,

yesterday we had an unpleasant surprise that I would like to discuss: Many
(not all!) of our VMs were suddenly dying (the qemu process exiting) and when
trying to restart them, we saw I/O errors on the disks inside the qemu
process and the OS was not able to start (i.e. it stopped in the initramfs).

When we exported the image from rbd and loop mounted it, there were however
no I/O errors and the filesystem could be cleanly mounted [-1].

We are running Devuan with kernel 3.16.0-4-amd64 and saw that there are some
problems reported with kernels < 3.16.39, so we upgraded one host that serves
as VM host and runs ceph OSDs to Devuan ascii with 4.9.0-3-amd64. Trying to
start the VM again on this host, however, resulted in the same I/O problem.

We then tried the "stupid" approach of exporting an image and importing it
again under the same name [0]. Surprisingly, this solved our problem
reproducibly for all affected VMs and allowed us to go back online.

We intentionally left one broken VM (a test VM) in our system so that we have
a chance of debugging further what happened and how we can prevent it from
happening again.

As you might have guessed, there were some events prior to this:

- Some weeks before, we upgraded our cluster from kraken to luminous (in the
  right order: mons first, then adding mgrs)
- About a week ago we added the first hdd to our cluster and modified the
  crushmap so that the "one" pool (used by opennebula) still selects only ssds
- Some hours before, we took one of the 5 hosts out of the ceph cluster, as
  we intended to replace the filesystem-based OSDs with bluestore (roughly 3
  hours prior to the event)
- A short time before the event we re-added an OSD, but did not "up" it

To our understanding, none of these actions should have triggered this
behaviour; however, we are aware that with the upgrade to luminous the client
libraries were also updated and not all qemu processes were restarted. [1]

After this long story, I was wondering about the following things:

- Why did this happen at all? And what is different after we reimported the
  image? Can it be related to disconnecting the image from its parent
  (opennebula creates clones prior to starting a VM)?
- We have one broken VM left - is there a way to get it back running without
  doing the export/import dance? ([2] sketches an alternative we have been
  wondering about, but have not tried.)
- Is http://tracker.ceph.com/issues/18807 related to our issue, and if so,
  how? How is the kernel involved in running VMs that use librbd? rbd
  showmapped does not show any mapped VMs, as qemu connects directly to ceph
  ([3] shows how we understand the disks are attached). We tried upgrading
  one host to Devuan ascii, which uses 4.9.0-3-amd64, but this did not fix
  our problem.

We would appreciate any pointers!

Best,

Nico

[-1]
losetup -P /dev/loop0 /var/tmp/one-staging/monitoring1-disk.img
mkdir /tmp/monitoring1-mnt
mount /dev/loop0p1 /tmp/monitoring1-mnt/

[0]
rbd export one/$img /var/tmp/one-staging/$img
rbd rm one/$img
rbd import /var/tmp/one-staging/$img one/$img
rm /var/tmp/one-staging/$img

[1]
[14:05:34] server5:~# ceph features
{
    "mon": {
        "group": {
            "features": "0x1ffddff8eea4fffb",
            "release": "luminous",
            "num": 3
        }
    },
    "osd": {
        "group": {
            "features": "0x1ffddff8eea4fffb",
            "release": "luminous",
            "num": 49
        }
    },
    "client": {
        "group": {
            "features": "0xffddff8ee84fffb",
            "release": "kraken",
            "num": 1
        },
        "group": {
            "features": "0xffddff8eea4fffb",
            "release": "luminous",
            "num": 4
        },
        "group": {
            "features": "0x1ffddff8eea4fffb",
            "release": "luminous",
            "num": 61
        }
    }
}
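[2]
A rough sketch of what we are considering for the remaining broken VM instead
of the export/import dance - we have not tried this yet, and the image name
is only a placeholder. Since opennebula starts VMs from clones, the idea
would be to check whether the image still references a parent and, if so,
detach it in place:

# show whether the image is still a clone (a "parent:" line appears if so)
rbd info one/$img

# copy the parent's data into the clone and remove the parent link
rbd flatten one/$img

If the problem really is related to the parent link, this would avoid
removing and re-creating the image.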
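[3]
For completeness, how we understand the disks are attached (the image name
and cephx user below are placeholders; the real command line is generated by
opennebula/libvirt): qemu opens the image through librbd in userspace, so no
kernel rbd mapping is involved, roughly like:

qemu-system-x86_64 ... \
    -drive format=raw,if=virtio,file=rbd:one/$img:id=libvirt:conf=/etc/ceph/ceph.conf

If that understanding is correct, it would explain why rbd showmapped shows
nothing and why the host kernel version should only matter indirectly.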
--
Modern, affordable, Swiss Virtual Machines.
Visit www.datacenterlight.ch