Hello Jason,

I think there is a slight misunderstanding: there is only one *VM* left
that we did not start, not one OSD.

Or does librbd also read ceph.conf, and will that cause qemu to output
debug messages? (See the ceph.conf sketch further down.)

Best,

Nico

Jason Dillaman <jdillama@xxxxxxxxxx> writes:

> I presume QEMU is using librbd instead of a mapped krbd block device,
> correct? If that is the case, can you add "debug-rbd=20" and "debug
> objecter=20" to your ceph.conf and boot up your last remaining broken
> OSD?
>
> On Sun, Sep 10, 2017 at 8:23 AM, Nico Schottelius
> <nico.schottelius@xxxxxxxxxxx> wrote:
>>
>> Good morning,
>>
>> yesterday we had an unpleasant surprise that I would like to discuss:
>>
>> Many (not all!) of our VMs were suddenly dying (the qemu process
>> exiting), and when we tried to restart them, we saw I/O errors on the
>> disks inside the qemu process and the OS was not able to start
>> (i.e. it stopped in the initramfs).
>>
>> When we exported the image from rbd and loop-mounted it, there were
>> however no I/O errors and the filesystem could be cleanly mounted [-1].
>>
>> We are running Devuan with kernel 3.16.0-4-amd64 and saw that there
>> are some problems reported with kernels < 3.16.39, and thus we
>> upgraded one host that serves as VM host and runs ceph OSDs to Devuan
>> ascii using kernel 4.9.0-3-amd64.
>>
>> Trying to start the VM again on this host, however, resulted in the
>> same I/O problem.
>>
>> We then took the "stupid" approach of exporting an image and importing
>> it again under the same name [0]. Surprisingly, this solved our
>> problem reproducibly for all affected VMs and allowed us to go back
>> online.
>>
>> We intentionally left one broken VM in our system (a test VM) so that
>> we have the chance of debugging further what happened and how we can
>> prevent it from happening again.
>>
>> As you might have guessed, there were some events prior to this:
>>
>> - Some weeks before, we upgraded our cluster from kraken to luminous
>>   (in the right order: mons first, then adding mgrs)
>>
>> - About a week ago we added the first HDD to our cluster and modified
>>   the crushmap so that the "one" pool (from OpenNebula) still selects
>>   only SSDs
>>
>> - Some hours before, we took out one of the 5 hosts of the ceph
>>   cluster, as we intended to replace the filesystem-based OSDs with
>>   bluestore (roughly 3 hours prior to the event)
>>
>> - A short time before the event, we re-added an OSD, but did not "up" it
>>
>> To our understanding, none of these actions should have triggered this
>> behaviour; however, we are aware that with the upgrade to luminous the
>> client libraries were also updated and not all qemu processes were
>> restarted. [1]
>>
>> After this long story, I was wondering about the following things:
>>
>> - Why did this happen at all?
>>   And what is different after we reimported the image?
>>   Can it be related to disconnecting the image from its parent
>>   (i.e. OpenNebula creates clones prior to starting a VM)?
>>
>> - We have one broken VM left - is there a way to get it back running
>>   without doing the export/import dance?
>>
>> - Is http://tracker.ceph.com/issues/18807 related to our issue, and if
>>   so, how?
>>   How is the kernel involved in running VMs that use librbd?
>>   rbd showmapped does not show any mapped VMs, as qemu connects
>>   directly to ceph.
>>
>> We tried upgrading one host to Devuan ascii, which uses 4.9.0-3-amd64,
>> but that did not fix our problem.
>>
>> We would appreciate any pointer!
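
For reference, a minimal sketch of where the debug settings suggested
above would go, assuming the [client] section of ceph.conf is the right
place for librbd consumers; the log file line is an added assumption and
not something mentioned in this thread:

    [client]
        # increase librbd and objecter verbosity as suggested above
        debug rbd = 20
        debug objecter = 20
        # assumption: a per-process log file the qemu process can write to
        log file = /var/log/ceph/qemu-$pid.log

A freshly started qemu/librbd instance should pick these up when it
reads ceph.conf; an already running process would not.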
>>
>> Best,
>>
>> Nico
>>
>>
>> [-1]
>> losetup -P /dev/loop0 /var/tmp/one-staging/monitoring1-disk.img
>> mkdir /tmp/monitoring1-mnt
>> mount /dev/loop0p1 /tmp/monitoring1-mnt/
>>
>>
>> [0]
>>
>> rbd export one/$img /var/tmp/one-staging/$img
>> rbd rm one/$img
>> rbd import /var/tmp/one-staging/$img one/$img
>> rm /var/tmp/one-staging/$img
>>
>> [1]
>> [14:05:34] server5:~# ceph features
>> {
>>     "mon": {
>>         "group": {
>>             "features": "0x1ffddff8eea4fffb",
>>             "release": "luminous",
>>             "num": 3
>>         }
>>     },
>>     "osd": {
>>         "group": {
>>             "features": "0x1ffddff8eea4fffb",
>>             "release": "luminous",
>>             "num": 49
>>         }
>>     },
>>     "client": {
>>         "group": {
>>             "features": "0xffddff8ee84fffb",
>>             "release": "kraken",
>>             "num": 1
>>         },
>>         "group": {
>>             "features": "0xffddff8eea4fffb",
>>             "release": "luminous",
>>             "num": 4
>>         },
>>         "group": {
>>             "features": "0x1ffddff8eea4fffb",
>>             "release": "luminous",
>>             "num": 61
>>         }
>>     }
>> }
>>
>>
>> --
>> Modern, affordable, Swiss Virtual Machines. Visit www.datacenterlight.ch

--
Modern, affordable, Swiss Virtual Machines. Visit www.datacenterlight.ch
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
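
As a footnote to [1] above: the single remaining "kraken" client matches
the statement that not all qemu processes were restarted after the
library upgrade. A rough, untested sketch of how such processes could be
found on a host, assuming the old librbd shared object was replaced on
disk during the upgrade (the pattern and paths are assumptions):

    # list qemu processes that still map a librbd object file that has
    # since been replaced on disk (shown with a "(deleted)" suffix in maps)
    for pid in $(pgrep -f qemu); do
        if grep -q 'librbd.*(deleted)' "/proc/$pid/maps" 2>/dev/null; then
            echo "qemu pid $pid still maps an old librbd"
        fi
    done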