Re: MDS damaged after mimic 13.2.1 to 13.2.2 upgrade

Sergey Malinin <hell@xxxxxxxxxxx> · Sun, 7 Oct 2018 13:52:47 +0300

I was able to start the MDS and mount the fs, but ownership/permissions are broken and 8k out of millions of files ended up in lost+found.

On 7.10.2018, at 02:04, Sergey Malinin <hell@xxxxxxxxxxx> wrote:

I'm at scan_links now and will post an update once it has finished. Have you reset the journal after fs recovery, as suggested in the doc?

quote:

If the damaged filesystem contains dirty journal data, it may be recovered next with:

cephfs-journal-tool --rank=<original filesystem name>:0 event recover_dentries list --alternate-pool recovery
cephfs-journal-tool --rank recovery-fs:0 journal reset --force

On 7.10.2018, at 00:36, Alfredo Daniel Rezinovsky <alfrenovsky@xxxxxxxxx> wrote:

    I did something wrong in the upgrade restart also...
    After rescanning with:

    cephfs-data-scan scan_extents cephfs_data (with threads)
    cephfs-data-scan scan_inodes cephfs_data (with threads)
    cephfs-data-scan scan_links

    my MDS still crashes and won't replay.

     1: (()+0x3ec320) [0x55b0e2bd2320]

     2: (()+0x12890) [0x7fc3adce3890]

     3: (gsignal()+0xc7) [0x7fc3acddbe97]

     4: (abort()+0x141) [0x7fc3acddd801]

     5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x250) [0x7fc3ae3cc080]

     6: (()+0x26c0f7) [0x7fc3ae3cc0f7]

     7: (()+0x21eb27) [0x55b0e2a04b27]

     8: (MDCache::journal_dirty_inode(MutationImpl*, EMetaBlob*, CInode*, snapid_t)+0xc0) [0x55b0e2a04d40]

     9: (Locker::check_inode_max_size(CInode*, bool, unsigned long, unsigned long, utime_t)+0x91d) [0x55b0e2a6a0fd]

     10: (RecoveryQueue::_recovered(CInode*, int, unsigned long, utime_t)+0x39f) [0x55b0e2a3ca2f]

     11: (MDSIOContextBase::complete(int)+0x119) [0x55b0e2b54ab9]

     12: (Filer::C_Probe::finish(int)+0xe7) [0x55b0e2bd94e7]

     13: (Context::complete(int)+0x9) [0x55b0e28e9719]

     14: (Finisher::finisher_thread_entry()+0x12e) [0x7fc3ae3ca4ce]

     15: (()+0x76db) [0x7fc3adcd86db]

     16: (clone()+0x3f) [0x7fc3acebe88f]

    Did you do something else before starting the MDSs again?
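
The "(with threads)" runs above presumably refer to the worker sharding that the Mimic disaster-recovery document describes, where each process is given `--worker_n <index>` and `--worker_m <total>`. A minimal sketch, assuming that is what was used; the helper name `scan_worker_cmds` and the worker count of 4 are illustrative only, and the function just prints commands instead of running them:

```shell
#!/bin/sh
# Dry run only: prints one cephfs-data-scan command per worker and does not
# touch the cluster.
# Usage: scan_worker_cmds <phase> <data-pool> <worker-count>
# (scan_worker_cmds is a hypothetical helper, not part of Ceph.)
scan_worker_cmds() {
    phase=$1
    pool=$2
    workers=$3
    n=0
    while [ "$n" -lt "$workers" ]; do
        # --worker_n is this worker's index, --worker_m the total worker count
        echo "cephfs-data-scan $phase --worker_n $n --worker_m $workers $pool"
        n=$((n + 1))
    done
}

# Example: four workers for the scan_extents phase on the pool named in the thread.
scan_worker_cmds scan_extents cephfs_data 4
```

Each printed line would be started as its own process. The same document notes that all workers must finish one phase before any worker starts the next.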

    On 05/10/18 21:17, Sergey Malinin wrote:

      I ended up rescanning the entire fs using the alternate metadata pool
      approach described in http://docs.ceph.com/docs/mimic/cephfs/disaster-recovery/

      The process has not completed yet because during the recovery our
      cluster encountered another problem with OSDs, which I got fixed
      yesterday (thanks to Igor Fedotov @ SUSE).

      The first stage (scan_extents) completed in 84 hours (120M objects in
      the data pool on 8 HDD OSDs on 4 hosts). The second (scan_inodes) was
      interrupted by the OSD failure, so I have no timing stats, but it seems
      to be running 2-3 times faster than the extents scan.

      As to the root cause: in my case I recall that during the upgrade I
      forgot to restart 3 OSDs, one of which was holding metadata pool
      contents, before restarting the MDS daemons. That seems to have
      contributed to the MDS journal corruption, because when I restarted
      those OSDs, the MDS was able to start up but soon failed, throwing lots
      of 'loaded dup inode' errors.

          On 6.10.2018, at 00:41, Alfredo Daniel Rezinovsky <alfrenovsky@xxxxxxxxx> wrote:

            Same problem...

              # cephfs-journal-tool --journal=purge_queue journal inspect

              2018-10-05 18:37:10.704 7f01f60a9bc0 -1 Missing object 500.0000016c

              Overall journal integrity: DAMAGED

              Objects missing:

                0x16c

              Corrupt regions:

                0x5b000000-ffffffffffffffff

              Just after upgrade to 13.2.2

              Did you fix it?
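
The missing object and the start of the corrupt region in the output above appear to describe the same location: dividing the offset 0x5b000000 by the journal object size gives the object index 0x16c. A rough sketch of that arithmetic, assuming 4 MiB journal objects (the thread does not state the object size) and taking the `500` prefix from the object name in the output; `journal_object` is a made-up helper name:

```shell
#!/bin/sh
# Assumption: 4 MiB journal objects (not stated anywhere in the thread).
OBJECT_SIZE=$((4 * 1024 * 1024))

# journal_object <object-name-prefix> <byte-offset>
# Prints the name of the object that would hold the given journal offset,
# in the "<prefix>.<8 hex digits>" form seen in the inspect output.
# (journal_object is a hypothetical helper, not a Ceph command.)
journal_object() {
    prefix=$1
    offset=$2
    printf '%s.%08x\n' "$prefix" $(($offset / $OBJECT_SIZE))
}

# Start of the corrupt region reported by "journal inspect"
journal_object 500 0x5b000000    # prints 500.0000016c
```

If the assumption holds, the corrupt region starts exactly at the first byte of the missing object, so the two lines of the report point at one problem rather than two.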

              On 26/09/18 13:05, Sergey Malinin wrote:

              Hello,

                I followed the standard upgrade procedure to upgrade from
                13.2.1 to 13.2.2.

                After the upgrade the MDS cluster is down; mds rank 0 and the
                purge_queue journal are damaged. Resetting purge_queue does
                not seem to work, as the journal still appears to be damaged.

                Can anybody help?

                mds log:

                  -789> 2018-09-26 18:42:32.527 7f70f78b1700  1 mds.mds2 Updating MDS map to version 586 from mon.2

                  -788> 2018-09-26 18:42:32.527 7f70f78b1700  1 mds.0.583 handle_mds_map i am now mds.0.583

                  -787> 2018-09-26 18:42:32.527 7f70f78b1700  1 mds.0.583 handle_mds_map state change up:rejoin --> up:active

                  -786> 2018-09-26 18:42:32.527 7f70f78b1700  1 mds.0.583 recovery_done -- successful recovery!

                <skip>

                   -38> 2018-09-26 18:42:32.707 7f70f28a7700 -1 mds.0.purge_queue _consume: Decode error at read_pos=0x322ec6636

                   -37> 2018-09-26 18:42:32.707 7f70f28a7700  5 mds.beacon.mds2 set_want_state: up:active -> down:damaged

                   -36> 2018-09-26 18:42:32.707 7f70f28a7700  5 mds.beacon.mds2 _send down:damaged seq 137

                   -35> 2018-09-26 18:42:32.707 7f70f28a7700 10 monclient: _send_mon_message to mon.ceph3 at mon:6789/0

                   -34> 2018-09-26 18:42:32.707 7f70f28a7700  1 -- mds:6800/e4cc09cf --> mon:6789/0 -- mdsbeacon(14c72/mds2 down:damaged seq 137 v24a) v7 -- 0x563b321ad480 con 0

                <skip>

                    -3> 2018-09-26 18:42:32.743 7f70f98b5700  5 -- mds:6800/3838577103 >> mon:6789/0 conn(0x563b3213e000 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=8 cs=1 l=1). rx mon.2 seq 29 0x563b321ab880 mdsbeacon(85106/mds2 down:damaged seq 311 v587) v7

                    -2> 2018-09-26 18:42:32.743 7f70f98b5700  1 -- mds:6800/3838577103 <== mon.2 mon:6789/0 29 ==== mdsbeacon(85106/mds2 down:damaged seq 311 v587) v7 ==== 129+0+0 (3296573291 0 0) 0x563b321ab880 con 0x563b3213e000

                    -1> 2018-09-26 18:42:32.743 7f70f98b5700  5 mds.beacon.mds2 handle_mds_beacon down:damaged seq 311 rtt 0.038261

                     0> 2018-09-26 18:42:32.743 7f70f28a7700  1 mds.mds2 respawn!

                # cephfs-journal-tool --journal=purge_queue journal inspect

                Overall journal integrity: DAMAGED

                Corrupt regions:

                  0x322ec65d9-ffffffffffffffff

                # cephfs-journal-tool --journal=purge_queue journal reset

                old journal was 13470819801~8463

                new journal start will be 13472104448 (1276184 bytes past old end)

                writing journal head

                done

                # cephfs-journal-tool --journal=purge_queue journal inspect

                2018-09-26 19:00:52.848 7f3f9fa50bc0 -1 Missing object 500.00000c8c

                Overall journal integrity: DAMAGED

                Objects missing:

                  0xc8c

                Corrupt regions:

                  0x323000000-ffffffffffffffff
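
The numbers printed by the reset and by the second inspect line up if the reset moves the journal start to the next object boundary past the old end. A small sketch of that check, assuming 4 MiB journal objects (not stated in the thread); `next_object_boundary` is a made-up helper, and the rounding rule is inferred from the output above rather than taken from the Ceph source:

```shell
#!/bin/sh
# Assumption: 4 MiB journal objects (not stated anywhere in the thread).
OBJECT_SIZE=$((4 * 1024 * 1024))

# next_object_boundary <start> <length>
# Takes the START~LENGTH pair from "old journal was START~LENGTH" and prints
# the first object-aligned offset past the old end.
# (next_object_boundary is a hypothetical helper, not a Ceph command.)
next_object_boundary() {
    old_end=$(($1 + $2))
    echo $(( (old_end / OBJECT_SIZE + 1) * OBJECT_SIZE ))
}

new_start=$(next_object_boundary 13470819801 8463)
echo "new start: $new_start"                                        # 13472104448
echo "bytes past old end: $((new_start - (13470819801 + 8463)))"    # 1276184
printf 'object at new start: 500.%08x\n' $((new_start / OBJECT_SIZE))  # 500.00000c8c
```

Under that reading, the object reported missing after the reset is the one at the new journal start, and 0x323000000 is the same offset as 13472104448. This would suggest that the second DAMAGED report describes a freshly reset journal whose first object has not been written yet, but that is an inference from the numbers and is not confirmed in the thread.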

                _______________________________________________

                ceph-users mailing list

                ceph-users@xxxxxxxxxxxxxx

                http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
