Re: Another OSD broken today. How can I recover it?

Gonzalo Aguilar Delgado <gaguilar@xxxxxxxxxxxxxxxxxx> · Tue, 5 Dec 2017 09:18:46 +0100



    Hi, 

    
    I posted the output at http://paste.debian.net/999172/, but that
      paste expires too soon, so I also put it at
      https://pastebin.com/QfrE71Dg.

    
    What I want to mention is that there's no known cause for what's
      happening. It's true that the clocks drift by a few milliseconds
      on reboot, but ntp corrects that quickly. There are no network
      issues, and the OSD log is included in the output.

    
    The only thing I see on the other OSDs is these errors, which are
      becoming more and more frequent:
    2017-12-05 08:58:56.637773 7f0feff7f700 -1 log_channel(cluster)
      log [ERR] : 10.7a shard 2: soid
      10:5ff4f7a3:::rbd_data.56bf3a4775a618.0000000000002efa:head
      data_digest 0xfae07534 != data_digest 0xe2de2a76 from auth oi
      10:5ff4f7a3:::rbd_data.56bf3a4775a618.0000000000002efa:head(3873'5250781
      client.5697316.0:51282235 dirty|data_digest|omap_digest s 4194304
      uv 5250781 dd e2de2a76 od ffffffff alloc_hint [0 0])

      2017-12-05 08:58:56.637775 7f0feff7f700 -1 log_channel(cluster)
      log [ERR] : 10.7a shard 6: soid
      10:5ff4f7a3:::rbd_data.56bf3a4775a618.0000000000002efa:head
      data_digest 0xfae07534 != data_digest 0xe2de2a76 from auth oi
      10:5ff4f7a3:::rbd_data.56bf3a4775a618.0000000000002efa:head(3873'5250781
      client.5697316.0:51282235 dirty|data_digest|omap_digest s 4194304
      uv 5250781 dd e2de2a76 od ffffffff alloc_hint [0 0])

      2017-12-05 08:58:56.637777 7f0feff7f700 -1 log_channel(cluster)
      log [ERR] : 10.7a soid
      10:5ff4f7a3:::rbd_data.56bf3a4775a618.0000000000002efa:head:
      failed to pick suitable auth object

    
    Basically, the digests don't match. Someone told me this can be
      caused by a faulty disk, so I replaced the offending drive, and
      now I find the same thing happening on the new disk. OK, but this
      thread is not about finding the source of that problem. That will
      be done later.
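
    To compare what each shard reports against the recorded digest, the
    scrub errors above can be grouped per object. This is only a minimal
    sketch: the regular expression is inferred from the three log lines
    quoted in this mail, and `digest_mismatches` and `SAMPLE` are
    hypothetical names, not part of any Ceph tool.

```python
import re
from collections import defaultdict

# Pattern inferred from the "[ERR]" scrub lines quoted in this mail (an assumption,
# not an official log grammar).
SCRUB_ERR = re.compile(
    r"log \[ERR\] : (?P<pg>\S+) shard (?P<shard>\d+): soid (?P<soid>\S+) "
    r"data_digest (?P<found>0x[0-9a-f]+) != data_digest (?P<expected>0x[0-9a-f]+)"
)

def digest_mismatches(log_text):
    """Map (pg, object) -> list of (shard, digest found, digest in object info)."""
    # The mail client wrapped the log lines, so collapse whitespace before matching.
    flat = " ".join(log_text.split())
    grouped = defaultdict(list)
    for m in SCRUB_ERR.finditer(flat):
        key = (m.group("pg"), m.group("soid"))
        grouped[key].append((int(m.group("shard")), m.group("found"), m.group("expected")))
    return dict(grouped)

# Two of the lines from this mail, still wrapped the way the mail shows them.
SAMPLE = """
2017-12-05 08:58:56.637773 7f0feff7f700 -1 log_channel(cluster)
  log [ERR] : 10.7a shard 2: soid
  10:5ff4f7a3:::rbd_data.56bf3a4775a618.0000000000002efa:head
  data_digest 0xfae07534 != data_digest 0xe2de2a76 from auth oi
2017-12-05 08:58:56.637775 7f0feff7f700 -1 log_channel(cluster)
  log [ERR] : 10.7a shard 6: soid
  10:5ff4f7a3:::rbd_data.56bf3a4775a618.0000000000002efa:head
  data_digest 0xfae07534 != data_digest 0xe2de2a76 from auth oi
"""

if __name__ == "__main__":
    for (pg, soid), shards in digest_mismatches(SAMPLE).items():
        print(pg, soid, shards)
```

    In the lines quoted above, shards 2 and 6 both report 0xfae07534
    while the object info records 0xe2de2a76, so the two shards agree
    with each other and disagree with the stored digest.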

    
    This thread is about trying to recover an OSD that looks fine to
      the object store tool but fails to start:
    

    Why does it break here?
    

       starting osd.4
        at :/0 osd_data /var/lib/ceph/osd/ceph-4
        /var/lib/ceph/osd/ceph-4/journal

        osd/PG.cc: In function 'static int
        PG::peek_map_epoch(ObjectStore*, spg_t, epoch_t*,
        ceph::bufferlist*)' thread 7f467ba0b8c0 time 2017-12-03
        13:39:29.495311

        osd/PG.cc: 3025: FAILED assert(values.size() == 2)

         ceph version 10.2.10 (5dc1e4c05cb68dbf62ae6fce3f0700e4654fdbbe)

         1: (ceph::__ceph_assert_fail(char const*, char const*, int,
        char const*)+0x80)
        [0x5556eab28790]                                 <---------
        HERE

         2: (PG::peek_map_epoch(ObjectStore*, spg_t, unsigned int*,
        ceph::buffer::list*)+0x661) [0x5556ea4e6601]

         3: (OSD::load_pgs()+0x75a) [0x5556ea43a8aa]

         4: (OSD::init()+0x2026) [0x5556ea445ca6]

         5: (main()+0x2ef1) [0x5556ea3b7301]

         6: (__libc_start_main()+0xf0) [0x7f467886b830]

         7: (_start()+0x29) [0x5556ea3f8b09]

         NOTE: a copy of the executable, or `objdump -rdS
        <executable>` is needed to interpret this.

        2017-12-03 13:39:29.497091 7f467ba0b8c0 -1 osd/PG.cc: In
        function 'static int PG::peek_map_epoch(ObjectStore*, spg_t,
        epoch_t*, ceph::bufferlist*)' thread 7f467ba0b8c0 time
        2017-12-03 13:39:29.495311

        osd/PG.cc: 3025: FAILED assert(values.size() == 2)

        
        So it looks like the offending code is this one:

        
          int r = store->omap_get_values(coll, pgmeta_oid, keys, &values);

          if (r == 0) {

            assert(values.size() == 2);     <------ Here

        
            // sanity check version
    
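
    Reading that snippet, the assert can only fire when the omap read
    succeeds (r == 0) but returns fewer than the two requested keys. The
    toy model below illustrates that condition. It is not Ceph code: the
    key names `_infover` and `_epoch` are my assumption from reading the
    Jewel PG.cc source and should be checked against your tree, and
    `omap_get_values` here is a plain-dict stand-in for the real call.

```python
# Assumed key names (from my reading of Jewel's PG.cc; verify before relying on them).
INFOVER_KEY = "_infover"
EPOCH_KEY = "_epoch"

def omap_get_values(omap, keys):
    """Toy stand-in for the store call: return only the requested keys that exist."""
    return {k: omap[k] for k in keys if k in omap}

def missing_pgmeta_keys(omap):
    """Return the requested keys absent from a pgmeta omap.

    An empty set corresponds to values.size() == 2, i.e. the assert passes.
    """
    wanted = {INFOVER_KEY, EPOCH_KEY}
    values = omap_get_values(omap, wanted)
    return wanted - set(values)

if __name__ == "__main__":
    healthy = {INFOVER_KEY: b"\x08", EPOCH_KEY: b"\x21\x0f\x00\x00"}
    damaged = {INFOVER_KEY: b"\x08"}  # hypothetical PG that lost its epoch key
    print(missing_pgmeta_keys(healthy))
    print(missing_pgmeta_keys(damaged))
```

    If this reading is right, the crash would point at one PG whose
    pgmeta object exists but has lost a key, rather than at the whole
    store being unreadable.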

    Meanwhile, the object store tool can read the same OSD without
        any problem, as you can see here:
    

    ceph-objectstore-tool --debug --op list-pgs --data-path /var/lib/ceph/osd/ceph-4 --journal-path /dev/sdf3

        2017-12-05 09:18:25.885258 7f5dd8b94a40  0
        filestore(/var/lib/ceph/osd/ceph-4) backend xfs (magic
        0x58465342)

        2017-12-05 09:18:25.885715 7f5dd8b94a40  0
        genericfilestorebackend(/var/lib/ceph/osd/ceph-4)
        detect_features: FIEMAP ioctl is disabled via 'filestore fiemap'
        config option

        2017-12-05 09:18:25.885734 7f5dd8b94a40  0
        genericfilestorebackend(/var/lib/ceph/osd/ceph-4)
        detect_features: SEEK_DATA/SEEK_HOLE is disabled via 'filestore
        seek data hole' config option

        2017-12-05 09:18:25.885755 7f5dd8b94a40  0
        genericfilestorebackend(/var/lib/ceph/osd/ceph-4)
        detect_features: splice is supported

        2017-12-05 09:18:25.910484 7f5dd8b94a40  0
        genericfilestorebackend(/var/lib/ceph/osd/ceph-4)
        detect_features: syncfs(2) syscall fully supported (by glibc and
        kernel)

        2017-12-05 09:18:25.910545 7f5dd8b94a40  0
        xfsfilestorebackend(/var/lib/ceph/osd/ceph-4) detect_feature:
        extsize is disabled by conf

        2017-12-05 09:18:26.639796 7f5dd8b94a40  0
        filestore(/var/lib/ceph/osd/ceph-4) mount: enabling WRITEAHEAD
        journal mode: checkpoint is not enabled

        2017-12-05 09:18:26.650560 7f5dd8b94a40  1 journal _open
        /dev/sdf3 fd 11: 5368709120 bytes, block size 4096 bytes,
        directio = 1, aio = 1

        2017-12-05 09:18:26.662606 7f5dd8b94a40  1 journal _open
        /dev/sdf3 fd 11: 5368709120 bytes, block size 4096 bytes,
        directio = 1, aio = 1

        2017-12-05 09:18:26.664869 7f5dd8b94a40  1
        filestore(/var/lib/ceph/osd/ceph-4) upgrade

        Cluster fsid=9028f4da-0d77-462b-be9b-dbdf7fa57771

        Supported features: compat={},rocompat={},incompat={1=initial
        feature set(~v.18),2=pginfo object,3=object locator,
        4=last_epoch_clean,5=categories,6=hobjectpool,7=biginfo,
        8=leveldbinfo,9=leveldblog,10=snapmapper,11=sharded objects,
        12=transaction hints,13=pg meta object}

        On-disk features: compat={},rocompat={},incompat={1=initial
        feature set(~v.18),2=pginfo object,3=object locator,
        4=last_epoch_clean,5=categories,6=hobjectpool,7=biginfo,
        8=leveldbinfo,9=leveldblog,10=snapmapper,11=sharded objects,
        12=transaction hints,13=pg meta object}

        Performing list-pgs operation

        ....
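
        Since list-pgs only enumerates the PG collections, one way to
        narrow this down might be to run `--op info` on each listed PG
        with the OSD stopped and note which ones fail. This is a rough
        sketch, not a tested recovery procedure: it assumes `--op info`
        reads the same per-PG metadata that the OSD reads at startup, and
        `info_command` and `find_failing_pgs` are hypothetical helper
        names.

```python
import subprocess

def info_command(pgid, data_path, journal_path):
    """Build a ceph-objectstore-tool invocation that queries one PG."""
    return [
        "ceph-objectstore-tool",
        "--data-path", data_path,
        "--journal-path", journal_path,
        "--pgid", pgid,
        "--op", "info",
    ]

def find_failing_pgs(pgids, data_path, journal_path, run=subprocess.run):
    """Return the PG ids for which the tool exits non-zero (including aborts)."""
    failing = []
    for pgid in pgids:
        proc = run(
            info_command(pgid, data_path, journal_path),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        # A failed assert aborts the process, which shows up as a non-zero return code.
        if proc.returncode != 0:
            failing.append(pgid)
    return failing

if __name__ == "__main__":
    # The PG ids would come from the list-pgs output; this one is only a placeholder.
    pgids = ["10.7a"]
    print(find_failing_pgs(pgids, "/var/lib/ceph/osd/ceph-4", "/dev/sdf3"))
```

        The `run` parameter exists only so the loop can be exercised
        without touching a real OSD.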

      
    On 04/12/17 12:21, Ronny Aasen wrote:

    
    ceph health detail
    
  
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com