Production 12.2.2 CephFS Cluster still broken, new Details

Tobias Prousa <tobias.prousa@xxxxxxxxx> · Tue, 12 Dec 2017 09:22:11 +0100

    Hi there,

    Regarding my ML post from yesterday (Upgrade from 12.2.1 to 12.2.2
    broke my CephFs), I was able to get a little further with the
    suggested "cephfs-table-tool take_inos <max ino>". This made
    the whole issue with loads of "falsely free-marked inodes" go away.

    I then restarted the MDS and kept all clients down so that no client
    had the FS mounted. Then I started an online MDS scrub:

    ceph daemon mds.a scrub_path / recursive repair

      This again ran for about 3 hours, then the MDS again marked the FS
      damaged and changed its own state to standby (at least that is how
      I interpret what I see). This happened exactly at the moment
      when the scrub hit a missing object. See the end of the logfile
      (default log level):

      2017-12-11 22:29:05.725484 7fc2342bc700  0 log_channel(cluster)
      log [WRN] : bad backtrace on inode
0x1000d3aede3(/home/some_username/.cache/mozilla/firefox/dsjf5siv.default/safebrowsing/test-unwanted-simple.sbstore),
      rewriting it

      2017-12-11 22:29:05.725507 7fc2342bc700  0 log_channel(cluster)
      log [WRN] : Scrub error on inode 0x1000d3aede3
(/home/some_username/.cache/mozilla/firefox/dsjf5siv.default/safebrowsing/test-unwanted-simple.sbstore)
      see mds.b log and `damage ls` output for details

      2017-12-11 22:29:05.725569 7fc2342bc700 -1 mds.0.scrubstack
      _validate_inode_done scrub error on inode [inode 0x1000d3aede3
      [2,head]
/home/some_username/.cache/mozilla/firefox/dsjf5siv.default/safebrowsing/test-unwanted-simple.sbstore
      auth v382 dirtyparent s=232 n(v0 b232 1=1+0) (iversion lock) |
      dirtyparent=1 scrubqueue=0 0x55ef37c83200]:
{"performed_validation":true,"passed_validation":false,"backtrace":{"checked":true,"passed":false,"read_ret_val":-61,"ondisk_value":"(-1)0x0:[]//","memoryvalue":"(0)0x1000d3aede3:[<0x1000d3aeda7/test-unwanted-simple.sbstore
      v382>,<0x10002de79e8/safebrowsing
      v7142119>,<0x10002de79df/dsjf5siv.default
      v4089757>,<0x10002de79de/firefox
      v3998050>,<0x10002de79dd/mozilla
      v4933047>,<0x100018bd837/.cache
      v115551644>,<0x10000000000/some_username
      v444724510>,<0x1/home v228039388>]//","error_str":"failed
      to read off disk; see
retval"},"raw_stats":{"checked":false,"passed":false,"read_ret_val":0,"ondisk_value.dirstat":"f()","ondisk_value.rstat":"n()","memory_value.dirrstat":"f()","memory_value.rstat":"n()","error_str":""},"return_code":-61}

      2017-12-11 22:29:05.729992 7fc2342bc700  0 log_channel(cluster)
      log [WRN] : bad backtrace on inode
0x1000d3aedf1(/home/some_username/.cache/mozilla/firefox/dsjf5siv.default/safebrowsing/testexcept-flashsubdoc-simple.sbstore),
      rewriting it

      2017-12-11 22:29:05.730022 7fc2342bc700  0 log_channel(cluster)
      log [WRN] : Scrub error on inode 0x1000d3aedf1
(/home/some_username/.cache/mozilla/firefox/dsjf5siv.default/safebrowsing/testexcept-flashsubdoc-simple.sbstore)
      see mds.b log and `damage ls` output for details

      2017-12-11 22:29:05.730077 7fc2342bc700 -1 mds.0.scrubstack
      _validate_inode_done scrub error on inode [inode 0x1000d3aedf1
      [2,head]
/home/some_username/.cache/mozilla/firefox/dsjf5siv.default/safebrowsing/testexcept-flashsubdoc-simple.sbstore
      auth v384 dirtyparent s=232 n(v0 b232 1=1+0) (iversion lock) |
      dirtyparent=1 scrubqueue=0 0x55ef3aa38a00]:
{"performed_validation":true,"passed_validation":false,"backtrace":{"checked":true,"passed":false,"read_ret_val":-61,"ondisk_value":"(-1)0x0:[]//","memoryvalue":"(0)0x1000d3aedf1:[<0x1000d3aeda7/testexcept-flashsubdoc-simple.sbstore
      v384>,<0x10002de79e8/safebrowsing
      v7142119>,<0x10002de79df/dsjf5siv.default
      v4089757>,<0x10002de79de/firefox
      v3998050>,<0x10002de79dd/mozilla
      v4933047>,<0x100018bd837/.cache
      v115551644>,<0x10000000000/some_username
      v444724510>,<0x1/home v228039388>]//","error_str":"failed
      to read off disk; see
retval"},"raw_stats":{"checked":false,"passed":false,"read_ret_val":0,"ondisk_value.dirstat":"f()","ondisk_value.rstat":"n()","memory_value.dirrstat":"f()","memory_value.rstat":"n()","error_str":""},"return_code":-61}

      2017-12-11 22:29:05.733389 7fc2342bc700  0 log_channel(cluster)
      log [WRN] : bad backtrace on inode
0x1000d3aedb6(/home/some_username/.cache/mozilla/firefox/dsjf5siv.default/safebrowsing/test-malware-simple.cache),
      rewriting it

      2017-12-11 22:29:05.733420 7fc2342bc700  0 log_channel(cluster)
      log [WRN] : Scrub error on inode 0x1000d3aedb6
(/home/some_username/.cache/mozilla/firefox/dsjf5siv.default/safebrowsing/test-malware-simple.cache)
      see mds.b log and `damage ls` output for details

      2017-12-11 22:29:05.733475 7fc2342bc700 -1 mds.0.scrubstack
      _validate_inode_done scrub error on inode [inode 0x1000d3aedb6
      [2,head]
/home/some_username/.cache/mozilla/firefox/dsjf5siv.default/safebrowsing/test-malware-simple.cache
      auth v366 dirtyparent s=44 n(v0 b44 1=1+0) (iversion lock) |
      dirtyparent=1 scrubqueue=0 0x55ef37c78a00]:
{"performed_validation":true,"passed_validation":false,"backtrace":{"checked":true,"passed":false,"read_ret_val":-61,"ondisk_value":"(-1)0x0:[]//","memoryvalue":"(0)0x1000d3aedb6:[<0x1000d3aeda7/test-malware-simple.cache
      v366>,<0x10002de79e8/safebrowsing
      v7142119>,<0x10002de79df/dsjf5siv.default
      v4089757>,<0x10002de79de/firefox
      v3998050>,<0x10002de79dd/mozilla
      v4933047>,<0x100018bd837/.cache
      v115551644>,<0x10000000000/some_username
      v444724510>,<0x1/home v228039388>]//","error_str":"failed
      to read off disk; see
retval"},"raw_stats":{"checked":false,"passed":false,"read_ret_val":0,"ondisk_value.dirstat":"f()","ondisk_value.rstat":"n()","memory_value.dirrstat":"f()","memory_value.rstat":"n()","error_str":""},"return_code":-61}

      2017-12-11 22:29:05.772351 7fc2342bc700  0
      mds.0.cache.dir(0x1000d3ae112) _fetched missing object for [dir
      0x1000d3ae112
/home/some_username/.cache/mozilla/firefox/dsjf5siv.default/safebrowsing-to_delete/
      [2,head] auth v=0 cv=0/0 ap=1+0+0 state=1073741952 f() n()
      hs=0+0,ss=0+0 | waiter=1 authpin=1 0x55eedee27a80]

      2017-12-11 22:29:05.772385 7fc2342bc700 -1 log_channel(cluster)
      log [ERR] : dir 0x1000d3ae112 object missing on disk; some files
      may be lost
(/home/some_username/.cache/mozilla/firefox/dsjf5siv.default/safebrowsing-to_delete)

      2017-12-11 22:29:05.778009 7fc2342bc700  1 mds.b respawn

      2017-12-11 22:29:05.778028 7fc2342bc700  1 mds.b  e:
      '/usr/bin/ceph-mds'

      2017-12-11 22:29:05.778031 7fc2342bc700  1 mds.b  0:
      '/usr/bin/ceph-mds'

      2017-12-11 22:29:05.778036 7fc2342bc700  1 mds.b  1: '-i'

      2017-12-11 22:29:05.778038 7fc2342bc700  1 mds.b  2: 'b'

      2017-12-11 22:29:05.778040 7fc2342bc700  1 mds.b  3: '--pid-file'

      2017-12-11 22:29:05.778042 7fc2342bc700  1 mds.b  4:
      '/var/run/ceph/mds.b.pid'

      2017-12-11 22:29:05.778044 7fc2342bc700  1 mds.b  5: '-c'

      2017-12-11 22:29:05.778046 7fc2342bc700  1 mds.b  6:
      '/etc/ceph/ceph.conf'

      2017-12-11 22:29:05.778048 7fc2342bc700  1 mds.b  7: '--cluster'

      2017-12-11 22:29:05.778050 7fc2342bc700  1 mds.b  8: 'ceph'

      2017-12-11 22:29:05.778051 7fc2342bc700  1 mds.b  9: '--setuser'

      2017-12-11 22:29:05.778053 7fc2342bc700  1 mds.b  10: 'ceph'

      2017-12-11 22:29:05.778055 7fc2342bc700  1 mds.b  11: '--setgroup'

      2017-12-11 22:29:05.778057 7fc2342bc700  1 mds.b  12: 'ceph'

      2017-12-11 22:29:05.778104 7fc2342bc700  1 mds.b respawning with
      exe /usr/bin/ceph-mds

      2017-12-11 22:29:05.778107 7fc2342bc700  1 mds.b  exe_path
      /proc/self/exe

      2017-12-11 22:29:06.186020 7f9ad28f41c0  0 ceph version 12.2.2
      (cf0baeeeeba3b47f9427c6c97e2144b094b7e5ba) luminous (stable),
      process (unknown), pid 3214

      2017-12-11 22:29:10.604701 7f9acbb38700  1 mds.b handle_mds_map
      standby
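
The validation result attached to each `_validate_inode_done` entry above is JSON, so it can be summarized mechanically. The following is a minimal sketch, not a Ceph tool: the function name `summarize_validation` and the small errno table are hypothetical, the sample is a shortened copy of the fields shown in the log, and it assumes the JSON has first been rejoined onto one line (the mail wrapping above breaks it). On Linux, errno 61 is ENODATA.

```python
import json

# Linux errno values only; an illustrative subset, not a complete table.
LINUX_ERRNO_NAMES = {2: "ENOENT", 61: "ENODATA"}


def summarize_validation(result_json):
    """Pull the backtrace check outcome out of one scrub validation result."""
    result = json.loads(result_json)
    backtrace = result.get("backtrace", {})
    ret = backtrace.get("read_ret_val", 0)
    return {
        "passed": result.get("passed_validation"),
        "backtrace_passed": backtrace.get("passed"),
        "read_ret_val": ret,
        # Negative return values are negated errno codes.
        "errno_name": LINUX_ERRNO_NAMES.get(-ret, "unknown") if ret < 0 else None,
    }


# Shortened sample: only fields that appear in the log above, values unchanged.
sample = (
    '{"performed_validation":true,"passed_validation":false,'
    '"backtrace":{"checked":true,"passed":false,"read_ret_val":-61,'
    '"error_str":"failed to read off disk; see retval"},'
    '"return_code":-61}'
)

if __name__ == "__main__":
    print(summarize_validation(sample))
```

For all three inodes in the log the backtrace read returns -61, so the failures share one cause rather than being three unrelated errors.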

    As long as the MDS was still active, "damage ls" again gave me exactly
    10001 damages of damage_type "backtrace". The log implies that those
    backtraces cannot be fixed automatically. I could live with losing
    those 10k files, but I do not understand why the MDS switches to
    "standby" and marks the FS damaged, rendering it offline.
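
With around 10k entries it helps to tally the damage table by type instead of reading it by eye. This is a minimal sketch, assuming the `damage ls` output has been saved as a JSON array of objects that each carry a "damage_type" field; the function name `tally_damage` and the sample entries (including their "id" and "ino" values) are made up for illustration and are not taken from this cluster.

```python
import json
from collections import Counter


def tally_damage(damage_ls_json):
    """Count damage-table entries per damage_type."""
    entries = json.loads(damage_ls_json)
    return Counter(entry.get("damage_type", "unknown") for entry in entries)


# Hypothetical sample, NOT real output from the cluster discussed here.
sample = """[
  {"damage_type": "backtrace", "id": 1, "ino": 100},
  {"damage_type": "backtrace", "id": 2, "ino": 101},
  {"damage_type": "dir_frag", "id": 3, "ino": 200}
]"""

if __name__ == "__main__":
    for damage_type, count in tally_damage(sample).most_common():
        print(damage_type, count)
```

A tally like this would show right away whether anything other than "backtrace" entries is recorded, for example entries for the directories whose objects are missing.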

    ceph -s then reports something like: mds: cephfs-0/1/1 1:damaged
    1:standby (not pasted, but typed manually from memory)

    Btw., in the log the MDS encountered two more "object
      missing on disk; some files may be lost" errors much earlier during
      that scrub (so three in total), but the first two did not make the
      MDS go to standby.
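
To compare the three occurrences, the "object missing on disk" errors can be pulled out of the MDS log together with their timestamps and directory inodes. This is a minimal sketch, assuming every log entry sits on a single line (unlike the wrapped excerpt above); the regular expression and the name `missing_dir_events` are illustrative and derived only from the entry format shown in this mail.

```python
import re

# Matches entries such as:
# "<timestamp> <thread> -1 log_channel(cluster) log [ERR] : dir 0x... object missing on disk; ..."
MISSING_DIR = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) .*"
    r"\[ERR\] : dir (?P<ino>0x[0-9a-f]+) object missing on disk"
)


def missing_dir_events(log_lines):
    """Return (timestamp, dir inode) for every 'object missing on disk' error."""
    events = []
    for line in log_lines:
        match = MISSING_DIR.search(line)
        if match:
            events.append((match["ts"], match["ino"]))
    return events


# Two entries from the log above, rejoined onto single lines.
sample_log = [
    "2017-12-11 22:29:05.772385 7fc2342bc700 -1 log_channel(cluster) log [ERR] : "
    "dir 0x1000d3ae112 object missing on disk; some files may be lost "
    "(/home/some_username/.cache/mozilla/firefox/dsjf5siv.default/safebrowsing-to_delete)",
    "2017-12-11 22:29:05.778009 7fc2342bc700  1 mds.b respawn",
]

if __name__ == "__main__":
    for ts, ino in missing_dir_events(sample_log):
        print(ts, ino)
```

Running this over the full log and checking which of the reported timestamps is followed by a "respawn" entry would show what distinguishes the third occurrence from the first two.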

      I marked the FS repaired, restarted the MDS with MDS debug level 20
      and reran a scrub on that particular path, but this time the MDS
      did not mark the whole FS damaged and stayed active. Will it only
      do so when finding three of those damages in a row?

      Is this a bug or is there something I would have to do to my
      cluster to get it back to stable working condition? Again, all
      this began with upgrading from 12.2.1 to 12.2.2.

      Furthermore, is there a way to get rid of those "broken" files
      (either those with a bad backtrace or, even more importantly, those
      with missing objects)? I could live with losing certain files if
      it helps to get CephFS working stably again.

      Again, any help is highly appreciated, I need to get the FS back
      up as soon as possible. Thank you very much!

      Best regards, 

      Tobi

