Re: Lost OSD - 1000: FAILED assert(r == 0)

Hi Guillaume,

Could you please set debug-bluefs to 20, restart the OSD, and collect the whole log?
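
For reference, one way to do that (a sketch only; the container name below is a placeholder, adjust for your Kolla deployment):

  # Raise BlueFS logging for osd.35 in the ceph.conf the OSD container uses:
  #   [osd.35]
  #   debug bluefs = 20/20
  # Then restart the OSD container and capture its full log, e.g.:
  docker restart <osd-35-container>
  docker logs <osd-35-container> > osd.35-debug-bluefs.log 2>&1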


Thanks,

Igor

On 5/24/2019 4:50 PM, Guillaume Chenuet wrote:
Hi,

We are running a Ceph cluster with 36 OSDs split across 3 servers (12 OSDs per server), on Ceph version 12.2.11 (26dc3775efc7bb286a1d6d66faee0ba30ea23eee) luminous (stable).

This cluster backs an OpenStack private cloud and is deployed with OpenStack Kolla. Each OSD runs in a Docker container on its server, and the MON, MGR, MDS, and RGW daemons run on 3 other servers.

This week, one OSD crashed and failed to restart, with this stack trace:

 Running command: '/usr/bin/ceph-osd -f --public-addr 10.106.142.30 --cluster-addr 10.106.142.30 -i 35'
+ exec /usr/bin/ceph-osd -f --public-addr 10.106.142.30 --cluster-addr 10.106.142.30 -i 35
starting osd.35 at - osd_data /var/lib/ceph/osd/ceph-35 /var/lib/ceph/osd/ceph-35/journal
/builddir/build/BUILD/ceph-12.2.11/src/os/bluestore/BlueFS.cc: In function 'int BlueFS::_read(BlueFS::FileReader*, BlueFS::FileReaderBuffer*, uint64_t, size_t, ceph::bufferlist*, char*)' thread 7efd088d6d80 time 2019-05-24 05:40:47.799918
/builddir/build/BUILD/ceph-12.2.11/src/os/bluestore/BlueFS.cc: 1000: FAILED assert(r == 0)
 ceph version 12.2.11 (26dc3775efc7bb286a1d6d66faee0ba30ea23eee) luminous (stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x110) [0x556f7833f8f0]
 2: (BlueFS::_read(BlueFS::FileReader*, BlueFS::FileReaderBuffer*, unsigned long, unsigned long, ceph::buffer::list*, char*)+0xca4) [0x556f782b5574]
 3: (BlueFS::_replay(bool)+0x2ef) [0x556f782c82af]
 4: (BlueFS::mount()+0x1d4) [0x556f782cc014]
 5: (BlueStore::_open_db(bool)+0x1847) [0x556f781e0ce7]
 6: (BlueStore::_mount(bool)+0x40e) [0x556f782126ae]
 7: (OSD::init()+0x3bd) [0x556f77dbbaed]
 8: (main()+0x2d07) [0x556f77cbe667]
 9: (__libc_start_main()+0xf5) [0x7efd04fa63d5]
 10: (()+0x4c1f73) [0x556f77d5ef73]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
*** Caught signal (Aborted) **
 in thread 7efd088d6d80 thread_name:ceph-osd
 ceph version 12.2.11 (26dc3775efc7bb286a1d6d66faee0ba30ea23eee) luminous (stable)
 1: (()+0xa63931) [0x556f78300931]
 2: (()+0xf5d0) [0x7efd05f995d0]
 3: (gsignal()+0x37) [0x7efd04fba207]
 4: (abort()+0x148) [0x7efd04fbb8f8]
 5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x284) [0x556f7833fa64]
 6: (BlueFS::_read(BlueFS::FileReader*, BlueFS::FileReaderBuffer*, unsigned long, unsigned long, ceph::buffer::list*, char*)+0xca4) [0x556f782b5574]
 7: (BlueFS::_replay(bool)+0x2ef) [0x556f782c82af]
 8: (BlueFS::mount()+0x1d4) [0x556f782cc014]
 9: (BlueStore::_open_db(bool)+0x1847) [0x556f781e0ce7]
 10: (BlueStore::_mount(bool)+0x40e) [0x556f782126ae]
 11: (OSD::init()+0x3bd) [0x556f77dbbaed]
 12: (main()+0x2d07) [0x556f77cbe667]
 13: (__libc_start_main()+0xf5) [0x7efd04fa63d5]
 14: (()+0x4c1f73) [0x556f77d5ef73]


The cluster health is OK, and Ceph sees this OSD as down.
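
For reference, this can be checked with the standard status commands, e.g.:

  ceph -s          # overall cluster status and health
  ceph osd tree    # per-OSD up/down state and placement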

I tried to find more information about this error on the internet, without luck.
Do you have any ideas or input about this error?

Thanks,
Guillaume


_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
