My immediate guess is that the caps for your OpenStack Ceph user are
incorrect. Please refer to step 6 of the Luminous upgrade guide [1] to
ensure your RBD users have permission to blacklist dead peers.

[1] http://docs.ceph.com/docs/master/releases/luminous/#upgrade-from-jewel-or-kraken

On Thu, May 10, 2018 at 9:49 AM, Jonathan Proulx <jon@xxxxxxxxxxxxx> wrote:
> Hi All,
>
> recently I saw a number of rbd backed VMs in my openstack cloud fail
> to reboot after a hypervisor crash with errors similar to:
>
> [ 5.279393] blk_update_request: I/O error, dev vda, sector 2048
> [ 5.281427] Buffer I/O error on dev vda1, logical block 0, lost async page write
> [ 5.284114] Buffer I/O error on dev vda1, logical block 1, lost async page write
> [ 5.286600] Buffer I/O error on dev vda1, logical block 2, lost async page write
> [ 5.289022] Buffer I/O error on dev vda1, logical block 3, lost async page write
> [ 5.291515] Buffer I/O error on dev vda1, logical block 4, lost async page write
> [ 5.338981] blk_update_request: I/O error, dev vda, sector 3088
>
> for many blocks and sectors. I was able to export the rbd images and
> they seemed fine, also 'rbd flatten' made them boot again with no
> errors.
>
> I found this puzzling and concerning but given the crash and limited
> time didn't really follow up.
>
> Today I intentionally rebooted a VM on a healthy hypervisor and had it
> land in the same condition, now I'm really worried.
>
> running:
> Ubuntu 16.04
> ceph version 12.2.2 (cf0baeeeeba3b47f9427c6c97e2144b094b7e5ba) luminous (stable) (on hypervisor)
> {
>     "mon": {
>         "ceph version 12.2.4 (52085d5249a80c5f5121a76d6288429f35e4e77b) luminous (stable)": 3
>     },
>     "mgr": {
>         "ceph version 12.2.4 (52085d5249a80c5f5121a76d6288429f35e4e77b) luminous (stable)": 3
>     },
>     "osd": {
>         "ceph version 12.2.2 (cf0baeeeeba3b47f9427c6c97e2144b094b7e5ba) luminous (stable)": 102,
>         "ceph version 12.2.3 (2dab17a455c09584f2a85e6b10888337d1ec8949) luminous (stable)": 10,
>         "ceph version 12.2.4 (52085d5249a80c5f5121a76d6288429f35e4e77b) luminous (stable)": 62
>     }
> }
> libvirt-bin 1.3.1-1ubuntu10.21
> qemu-system 1:2.5+dfsg-5ubuntu10.24
> OpenStack Mitaka
>
> Anyone seen anything like this or have suggestions where to look for more details?
>
> -Jon
> --
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

--
Jason
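
P.S. To make the caps suggestion at the top of this reply a bit more concrete: as I read step 6 of the guide [1], an RBD user whose mon caps are only 'allow r' needs them extended to something like 'allow r, allow command "osd blacklist"' (the Luminous 'profile rbd' cap should cover this as well). Below is a minimal Python sketch, not an official Ceph tool, that checks a mon caps string for that permission. The function name and the sample caps strings are hypothetical; substitute the real mon caps shown by `ceph auth get client.<ID>` for your OpenStack user. It is a plain string heuristic and does not understand the full caps grammar.

```python
import re


def mon_caps_allow_blacklist(mon_caps: str) -> bool:
    """Heuristic: True if a mon caps string appears to permit 'osd blacklist'.

    This is an illustrative check only; it recognises a few common grant
    forms and treats everything else as "not permitted".
    """
    # Caps are comma-separated grants; normalise the whitespace in each one.
    grants = [" ".join(g.split()) for g in mon_caps.split(",")]
    for grant in grants:
        # Blanket grants that are assumed here to include blacklisting.
        if grant in ("allow *", "profile rbd", "allow profile rbd"):
            return True
        # Explicit grant of the blacklist command, with or without quotes.
        if re.fullmatch(r'allow command "?osd blacklist"?', grant):
            return True
    return False


# Hypothetical examples of mon caps before and after the upgrade step.
old_style = "allow r"
updated = 'allow r, allow command "osd blacklist"'

print(mon_caps_allow_blacklist(old_style))  # False
print(mon_caps_allow_blacklist(updated))    # True
```

If this reports False for your user, the fix in the guide is along the lines of `ceph auth caps client.<ID> mon 'allow r, allow command "osd blacklist"' osd '<existing OSD caps for user>'`. Keep in mind that `ceph auth caps` replaces all caps for the user, so the existing OSD caps have to be restated in the same command.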