Pierre Dittes helped me by pointing out that I needed --rank=yourfsname:all, and with that I ran the following steps from the disaster recovery page: journal export, dentry recovery, journal truncation, MDS table wipes (session, snap, and inode), scan_extents, scan_inodes, scan_links, and cleanup.
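For reference, that was roughly this sequence from the disaster recovery page ("yourfsname" and "<data pool>" here stand in for my actual filesystem and data pool names):

  cephfs-journal-tool --rank=yourfsname:all journal export backup.bin
  cephfs-journal-tool --rank=yourfsname:all event recover_dentries summary
  cephfs-journal-tool --rank=yourfsname:all journal reset
  cephfs-table-tool all reset session
  cephfs-table-tool all reset snap
  cephfs-table-tool all reset inode
  cephfs-data-scan scan_extents <data pool>
  cephfs-data-scan scan_inodes <data pool>
  cephfs-data-scan scan_links
  cephfs-data-scan cleanup <data pool>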
Now all three of my MDS daemons are crashing due to a failed assert. The log with the stacktrace is included below the quoted message; the other two servers show the same stacktrace in their logs. Currently I can't mount CephFS, which makes sense, since no MDS stays up for more than a few minutes before crashing.

Any suggestions on next steps to troubleshoot or fix this? Hopefully there's some way to recover from this so I don't have to tell my users that I lost all the data and we need to go back to the backups. That wouldn't be a huge problem in itself, but it would cost a lot of confidence in Ceph and its ability to keep data safe.

Thanks,
Adam

On 8/8/19 3:31 PM, ☣Adam wrote:
> I had a machine with insufficient memory and it seems to have corrupted
> data on my MDS. The filesystem seems to be working fine, with the
> exception of accessing specific files.
>
> The ceph-mds logs include things like:
> mds.0.1596621 unhandled write error (2) No such file or directory, force readonly...
> dir 0x1000000fb03 object missing on disk; some files may be lost (/adam/programming/bash)
>
> I'm using mimic and trying to follow the instructions here:
> https://docs.ceph.com/docs/mimic/cephfs/disaster-recovery/
>
> The punchline is this:
> cephfs-journal-tool --rank all journal export backup.bin
> Error ((22) Invalid argument)
> 2019-08-08 20:02:39.847 7f06827537c0 -1 main: Couldn't determine MDS rank.
>
> I have a backup (outside of Ceph) of all the data that is inaccessible,
> and I can back up anything that is still accessible if need be. There's
> some more information below, but my main question is: what are my next steps?
>
> On a side note, I'd like to get involved with helping with the documentation
> (man pages, the Ceph website, usage text, etc.). Where can I get started?
>
> Here's the context:
>
> cephfs-journal-tool event recover_dentries summary
> Error ((22) Invalid argument)
> 2019-08-08 19:50:04.798 7f21f4ffe7c0 -1 main: missing mandatory "--rank" argument
>
> This seems like a documentation bug, since `--rank` is a "mandatory
> option" according to the help text. Based on `ceph health detail`, the
> MDS rank of this node appears to be 0, but neither `--rank 0` nor
> `--rank all` works either:
>
> ceph health detail
> HEALTH_ERR 1 MDSs report damaged metadata; 1 MDSs are read only
> MDS_DAMAGE 1 MDSs report damaged metadata
>     mdsge.hax0rbana.org(mds.0): Metadata damage detected
> MDS_READ_ONLY 1 MDSs are read only
>     mdsge.hax0rbana.org(mds.0): MDS in read-only mode
>
> cephfs-journal-tool --rank 0 event recover_dentries summary
> Error ((22) Invalid argument)
> 2019-08-08 19:54:45.583 7f5b37c4c7c0 -1 main: Couldn't determine MDS rank.
>
> The only place I've found this error message is in an unanswered
> Stack Overflow question and in the source code here:
> https://github.com/ceph/ceph/blob/master/src/tools/cephfs/JournalTool.cc#L114
>
> It looks like that code is trying to read a filesystem map (fsmap),
> which might be corrupted. Running `rados export` prints part of the
> help text and then segfaults, which is rather concerning. This is 100%
> repeatable (outside of gdb; details below). `rados df` works fine, so
> it's not all rados commands that have this problem. However, `rados
> bench 60 seq` also printed the usage text and then segfaulted.
>
> Info on the `rados export` crash:
> rados export
> usage: rados [options] [commands]
> POOL COMMANDS
> <snip>
> IMPORT AND EXPORT
>    export [filename]
>        Serialize pool contents to a file or standard out.
> <snip>
> OMAP OPTIONS:
>    --omap-key-file file             read the omap key from a file
> *** Caught signal (Segmentation fault) **
>  in thread 7fcb6bfff700 thread_name:fn_anonymous
>
> When running it in gdb:
> (gdb) bt
> #0 0x00007fffef07331f in std::_Rb_tree<std::__cxx11::basic_string<char,
> std::char_traits<char>, std::allocator<char> >,
> std::pair<std::__cxx11::basic_string<char, std::char_traits<char>,
> std::allocator<char> > const, std::map<int, boost::variant<boost::blank,
> std::__cxx11::basic_string<char, std::char_traits<char>,
> std::allocator<char> >, unsigned long, long, double, bool,
> entity_addr_t, std::chrono::duration<long, std::ratio<1l, 1l> >,
> Option::size_t, uuid_d>, std::less<int>, std::allocator<std::pair<int
> const, boost::variant<boost::blank, std::__cxx11::basic_string<char,
> std::char_traits<char>, std::allocator<char> >, unsigned long, long,
> double, bool, entity_addr_t, std::chrono::duration<long, std::ratio<1l,
> 1l> >, Option::size_t, uuid_d> > > > >,
> std::_Select1st<std::pair<std::__cxx11::basic_string<char,
> std::char_traits<char>, std::allocator<char> > const, std::map<int,
> boost::variant<boost::blank, std::__cxx11::basic_string<char,
> std::char_traits<char>, std::allocator<char> >, unsigned long, long,
> double, bool, entity_addr_t, std::chrono::duration<long, std::ratio<1l,
> 1l> >, Option::size_t, uuid_d>, std::less<int>,
> std::allocator<std::pair<int const, boost::variant<boost::blank,
> std::__cxx11::basic_string<char, std::char_traits<char>,
> std::allocator<char> >, unsigned long, long, double, bool,
> entity_addr_t, std::chrono::duration<long, std::ratio<1l, 1l> >,
> Option::size_t, uuid_d> > > > > >,
> std::less<std::__cxx11::basic_string<char, std::char_traits<char>,
> std::allocator<char> > >,
> std::allocator<std::pair<std::__cxx11::basic_string<char,
> std::char_traits<char>, std::allocator<char> > const, std::map<int,
> boost::variant<boost::blank, std::__cxx11::basic_string<char,
> std::char_traits<char>, std::allocator<char> >, unsigned long, long,
> double, bool, entity_addr_t, std::chrono::duration<long, std::ratio<1l,
> 1l> >, Option::size_t, uuid_d>, std::less<int>,
> std::allocator<std::pair<int const, boost::variant<boost::blank,
> std::__cxx11::basic_string<char, std::char_traits<char>,
> std::allocator<char> >, unsigned long, long, double, bool,
> entity_addr_t, std::chrono::duration<long, std::ratio<1l, 1l> >,
> Option::size_t, uuid_d> > > > >
> >> ::find(std::__cxx11::basic_string<char, std::char_traits<char>,
> std::allocator<char> > const&) const () from
> /usr/lib/ceph/libceph-common.so.0
> Backtrace stopped: Cannot access memory at address 0x7fffd9ff89f8
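One note for the archives: as far as I understand it, the "Couldn't determine MDS rank" errors quoted above happened because mimic's cephfs-journal-tool expects the rank in the form <filesystem name>:<rank> (or <filesystem name>:all) rather than a bare number. So for a filesystem named "yourfsname", something like this works where `--rank 0` and `--rank all` did not:

  cephfs-journal-tool --rank=yourfsname:0 event recover_dentries summary

Here is the log from one of the crashing MDS daemons (the other two show an identical stacktrace):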
Aug 13 01:10:03 xe ceph-mds[2492820]: -1315> 2019-08-13 01:10:02.652 7f92db16d700 -1 log_channel(cluster) log [ERR] : bad backtrace on directory inode 0x1000006ac48
Aug 13 01:10:03 xe ceph-mds[2492820]: -1315> 2019-08-13 01:10:02.660 7f92db16d700 -1 log_channel(cluster) log [ERR] : bad backtrace on directory inode 0x10000989b62
Aug 13 01:10:03 xe ceph-mds[2492820]: -1315> 2019-08-13 01:10:02.660 7f92db16d700 -1 log_channel(cluster) log [ERR] : bad backtrace on directory inode 0x10000989c72
Aug 13 01:10:03 xe ceph-mds[2492820]: -1315> 2019-08-13 01:10:03.648 7f92dc970700 -1 /build/ceph-13.2.6/src/mds/Server.cc: In function 'void Server::_rename_prepare(MDRequestRef&, EMetaBlob*, ceph::bufferlist*, CDentry*, CDentry*, CDentr
Aug 13 01:10:03 xe ceph-mds[2492820]: /build/ceph-13.2.6/src/mds/Server.cc: 8036: FAILED assert(oldin->first <= straydn->first)
Aug 13 01:10:03 xe ceph-mds[2492820]: ceph version 13.2.6 (7b695f835b03642f85998b2ae7b6dd093d9fbce4) mimic (stable)
Aug 13 01:10:03 xe ceph-mds[2492820]: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x14e) [0x7f92e9dc9b5e]
Aug 13 01:10:03 xe ceph-mds[2492820]: 2: (()+0x2c4cb7) [0x7f92e9dc9cb7]
Aug 13 01:10:03 xe ceph-mds[2492820]: 3: (Server::_rename_prepare(boost::intrusive_ptr<MDRequestImpl>&, EMetaBlob*, ceph::buffer::list*, CDentry*, CDentry*, CDentry*)+0x2ed8) [0x55591e3068a8]
Aug 13 01:10:03 xe ceph-mds[2492820]: 4: (Server::handle_client_rename(boost::intrusive_ptr<MDRequestImpl>&)+0x2e14) [0x55591e309764]
Aug 13 01:10:03 xe ceph-mds[2492820]: 5: (Server::handle_client_request(MClientRequest*)+0x272) [0x55591e325302]
Aug 13 01:10:03 xe ceph-mds[2492820]: 6: (Server::dispatch(Message*)+0x179) [0x55591e328e49]
Aug 13 01:10:03 xe ceph-mds[2492820]: 7: (MDSRank::handle_deferrable_message(Message*)+0x6dc) [0x55591e29d56c]
Aug 13 01:10:03 xe ceph-mds[2492820]: 8: (MDSRank::_dispatch(Message*, bool)+0x61a) [0x55591e2b3e4a]
Aug 13 01:10:03 xe ceph-mds[2492820]: 9: (MDSRank::retry_dispatch(Message*)+0x12) [0x55591e2b4782]
Aug 13 01:10:03 xe ceph-mds[2492820]: 10: (MDSInternalContextBase::complete(int)+0x6b) [0x55591e4f70eb]
Aug 13 01:10:03 xe ceph-mds[2492820]: 11: (MDSRank::_advance_queues()+0xec) [0x55591e2b30fc]
Aug 13 01:10:03 xe ceph-mds[2492820]: 12: (MDSRank::ProgressThread::entry()+0x3d) [0x55591e2b370d]
Aug 13 01:10:03 xe ceph-mds[2492820]: 13: (()+0x76db) [0x7f92e967d6db]
Aug 13 01:10:03 xe ceph-mds[2492820]: 14: (clone()+0x3f) [0x7f92e886388f]
Aug 13 01:10:03 xe ceph-mds[2492820]: -1315> 2019-08-13 01:10:03.652 7f92dc970700 -1 *** Caught signal (Aborted) **
Aug 13 01:10:03 xe ceph-mds[2492820]: in thread 7f92dc970700 thread_name:mds_rank_progr
Aug 13 01:10:03 xe ceph-mds[2492820]: ceph version 13.2.6 (7b695f835b03642f85998b2ae7b6dd093d9fbce4) mimic (stable)
Aug 13 01:10:03 xe ceph-mds[2492820]: 1: (()+0x12890) [0x7f92e9688890]
Aug 13 01:10:03 xe ceph-mds[2492820]: 2: (gsignal()+0xc7) [0x7f92e8780e97]
Aug 13 01:10:03 xe ceph-mds[2492820]: 3: (abort()+0x141) [0x7f92e8782801]
Aug 13 01:10:03 xe ceph-mds[2492820]: 4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x25f) [0x7f92e9dc9c6f]
Aug 13 01:10:03 xe ceph-mds[2492820]: 5: (()+0x2c4cb7) [0x7f92e9dc9cb7]
Aug 13 01:10:03 xe ceph-mds[2492820]: 6: (Server::_rename_prepare(boost::intrusive_ptr<MDRequestImpl>&, EMetaBlob*, ceph::buffer::list*, CDentry*, CDentry*, CDentry*)+0x2ed8) [0x55591e3068a8]
Aug 13 01:10:03 xe ceph-mds[2492820]: 7: (Server::handle_client_rename(boost::intrusive_ptr<MDRequestImpl>&)+0x2e14) [0x55591e309764]
Aug 13 01:10:03 xe ceph-mds[2492820]: 8: (Server::handle_client_request(MClientRequest*)+0x272) [0x55591e325302]
Aug 13 01:10:03 xe ceph-mds[2492820]: 9: (Server::dispatch(Message*)+0x179) [0x55591e328e49]
Aug 13 01:10:03 xe ceph-mds[2492820]: 10: (MDSRank::handle_deferrable_message(Message*)+0x6dc) [0x55591e29d56c]
Aug 13 01:10:03 xe ceph-mds[2492820]: 11: (MDSRank::_dispatch(Message*, bool)+0x61a) [0x55591e2b3e4a]
Aug 13 01:10:03 xe ceph-mds[2492820]: 12: (MDSRank::retry_dispatch(Message*)+0x12) [0x55591e2b4782]
Aug 13 01:10:03 xe ceph-mds[2492820]: 13: (MDSInternalContextBase::complete(int)+0x6b) [0x55591e4f70eb]
Aug 13 01:10:03 xe ceph-mds[2492820]: 14: (MDSRank::_advance_queues()+0xec) [0x55591e2b30fc]
Aug 13 01:10:03 xe ceph-mds[2492820]: 15: (MDSRank::ProgressThread::entry()+0x3d) [0x55591e2b370d]
Aug 13 01:10:03 xe ceph-mds[2492820]: 16: (()+0x76db) [0x7f92e967d6db]
Aug 13 01:10:03 xe ceph-mds[2492820]: 17: (clone()+0x3f) [0x7f92e886388f]
Aug 13 01:10:03 xe ceph-mds[2492820]: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
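If a disassembly would help with decoding those offsets, I can generate the one the NOTE above asks for, along these lines (assuming the binary lives at /usr/bin/ceph-mds, which is where my Ubuntu package put it):

  objdump -rdS /usr/bin/ceph-mds > ceph-mds-objdump.txt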