Sorry -- meant VM. Yes, librbd uses ceph.conf for configuration settings.

On Sun, Sep 10, 2017 at 9:22 AM, Nico Schottelius
<nico.schottelius@xxxxxxxxxxx> wrote:
>
> Hello Jason,
>
> I think there is a slight misunderstanding:
> there is only one *VM* left that we did not start, not one OSD.
>
> Or does librbd also read ceph.conf, and will that cause qemu to output
> debug messages?
>
> Best,
>
> Nico
>
> Jason Dillaman <jdillama@xxxxxxxxxx> writes:
>
>> I presume QEMU is using librbd instead of a mapped krbd block device,
>> correct? If that is the case, can you add "debug rbd = 20" and
>> "debug objecter = 20" to your ceph.conf and boot up your last
>> remaining broken OSD?
>>
>> On Sun, Sep 10, 2017 at 8:23 AM, Nico Schottelius
>> <nico.schottelius@xxxxxxxxxxx> wrote:
>>>
>>> Good morning,
>>>
>>> yesterday we had an unpleasant surprise that I would like to discuss:
>>>
>>> many (not all!) of our VMs suddenly died (the qemu processes exited),
>>> and when we tried to restart them, the qemu processes reported I/O
>>> errors on the disks and the OS was unable to boot (i.e. it stopped
>>> in the initramfs).
>>>
>>> When we exported the image from rbd and loop mounted it, there were
>>> however no I/O errors and the filesystem could be mounted cleanly [-1].
>>>
>>> We are running Devuan with kernel 3.16.0-4-amd64 and saw that some
>>> problems are reported with kernels < 3.16.39, so we upgraded one host
>>> that serves as both VM host and ceph OSD host to Devuan ascii with
>>> kernel 4.9.0-3-amd64.
>>>
>>> Trying to start the VM again on this host resulted in the same I/O
>>> problem, however.
>>>
>>> We then took the "stupid" approach of exporting an image and importing
>>> it again under the same name [0]. Surprisingly, this reproducibly
>>> solved the problem for all affected VMs and allowed us to go back
>>> online.
>>>
>>> We intentionally left one broken VM (a test VM) in our system so that
>>> we have a chance to debug further what happened and how we can prevent
>>> it from happening again.
>>>
>>> As you might have guessed, there were some events prior to this:
>>>
>>> - Some weeks before, we upgraded our cluster from kraken to luminous
>>>   (in the right order: mons first, then adding mgrs).
>>>
>>> - About a week ago we added the first hdd to our cluster and modified
>>>   the crushmap so that the "one" pool (used by opennebula) still
>>>   selects only ssds.
>>>
>>> - Some hours before (roughly 3 hours prior to the event), we took one
>>>   of the 5 hosts out of the ceph cluster, as we intended to replace
>>>   its filestore OSDs with bluestore.
>>>
>>> - Shortly before the event we re-added an osd, but did not "up" it.
>>>
>>> To our understanding, none of these actions should have triggered this
>>> behaviour; however, we are aware that the upgrade to luminous also
>>> updated the client libraries, and not all qemu processes were
>>> restarted afterwards. [1]
>>>
>>> After this long story, I was wondering about the following things:
>>>
>>> - Why did this happen at all, and what is different after we
>>>   reimported the image? Can it be related to disconnecting the image
>>>   from its parent (opennebula creates clones prior to starting a VM)?
>>>
>>> - We have one broken VM left - is there a way to get it back running
>>>   without doing the export/import dance?
>>>
>>> - Is http://tracker.ceph.com/issues/18807 related to our issue, and
>>>   if so, how? How is the kernel involved in running VMs that use
>>>   librbd? rbd showmapped does not show any mapped images, as qemu
>>>   connects directly to ceph.
>>>
>>> We tried upgrading one host to Devuan ascii, which uses kernel
>>> 4.9.0-3-amd64, but that did not fix our problem.
>>>
>>> We would appreciate any pointers!
>>>
>>> Best,
>>>
>>> Nico
>>>
>>>
>>> [-1]
>>> losetup -P /dev/loop0 /var/tmp/one-staging/monitoring1-disk.img
>>> mkdir /tmp/monitoring1-mnt
>>> mount /dev/loop0p1 /tmp/monitoring1-mnt/
>>>
>>>
>>> [0]
>>>
>>> rbd export one/$img /var/tmp/one-staging/$img
>>> rbd rm one/$img
>>> rbd import /var/tmp/one-staging/$img one/$img
>>> rm /var/tmp/one-staging/$img
>>>
>>> [1]
>>> [14:05:34] server5:~# ceph features
>>> {
>>>     "mon": {
>>>         "group": {
>>>             "features": "0x1ffddff8eea4fffb",
>>>             "release": "luminous",
>>>             "num": 3
>>>         }
>>>     },
>>>     "osd": {
>>>         "group": {
>>>             "features": "0x1ffddff8eea4fffb",
>>>             "release": "luminous",
>>>             "num": 49
>>>         }
>>>     },
>>>     "client": {
>>>         "group": {
>>>             "features": "0xffddff8ee84fffb",
>>>             "release": "kraken",
>>>             "num": 1
>>>         },
>>>         "group": {
>>>             "features": "0xffddff8eea4fffb",
>>>             "release": "luminous",
>>>             "num": 4
>>>         },
>>>         "group": {
>>>             "features": "0x1ffddff8eea4fffb",
>>>             "release": "luminous",
>>>             "num": 61
>>>         }
>>>     }
>>> }
>>>
>>>
>>> --
>>> Modern, affordable, Swiss Virtual Machines. Visit www.datacenterlight.ch
>
>
> --
> Modern, affordable, Swiss Virtual Machines. Visit www.datacenterlight.ch


--
Jason
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
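
For reference, a minimal sketch of the client-side logging Jason suggests.
It assumes the qemu/librbd client on the VM host reads /etc/ceph/ceph.conf
and that a [client] section is appropriate; the log file path shown is a
placeholder:

    # /etc/ceph/ceph.conf on the VM host running qemu + librbd
    [client]
        debug rbd      = 20
        debug objecter = 20
        # placeholder path; make sure the qemu user can write here
        log file       = /var/log/ceph/qemu-client.$pid.log

librbd only reads ceph.conf when the client process connects to the
cluster, so the extra output will only appear for a qemu process started
(or restarted) after the change.

On the clone/parent question above, one way to inspect the remaining
broken image without the export/import dance - purely a sketch, with no
claim that flattening will actually bring the VM back:

    # does the image still reference an opennebula parent snapshot?
    rbd info one/$img              # look for a "parent:" line

    # flattening copies the parent's blocks into the clone and detaches
    # it, which is roughly what export/import achieves as a side effect
    rbd flatten one/$img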