Hi,
I have a weird problem with my Ceph cluster.
Basic info:
- 3-node cluster
- CephFS runs on three pools:
  - cephfs_meta (replicated)
  - ec_basic (erasure coded)
  - ec_sensitive (erasure coded with higher redundancy)
My MDS keeps crashing with a bad backtrace error:
2022-02-21T16:11:09.661+0100 7fd2cd290700 -1 log_channel(cluster) log
[ERR] : bad backtrace on directory inode 0x10002000f5d
So far so good. To the best of my understanding, these metadata errors
should be fixable by following the disaster recovery procedure described
here:
https://docs.ceph.com/en/nautilus/cephfs/disaster-recovery-experts/
However, the weird part is that the error remains unchanged. Even
directly after resetting, i.e. before recreating the metadata objects,
the error does not change.
Is there something else that I need to reset?
I have already tried to delete the corrupt inode via rmomapkey; as a
result,
rados -p cephfs_meta listomapkeys 10002000f5d.00000000
now returns nothing.
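For reference, the object name used in the rados command is derived from
the inode number in the error message: the inode number in hex, a dot,
and the directory fragment id as eight hex digits. Here is a minimal
sketch in Python. The helper name dirfrag_object_name is hypothetical,
and it assumes the directory is unfragmented (fragment id 0), which is
what the 00000000 suffix above corresponds to.

```python
def dirfrag_object_name(ino: int, frag: int = 0) -> str:
    """Hypothetical helper: name of the metadata-pool object for a dirfrag.

    Assumes an unfragmented directory by default (frag id 0).
    """
    # Inode number in lowercase hex without the 0x prefix,
    # then the fragment id zero-padded to 8 hex digits.
    return f"{ino:x}.{frag:08x}"


# Inode from the "bad backtrace on directory inode" message.
print(dirfrag_object_name(0x10002000F5D))  # prints 10002000f5d.00000000
```

This only shows how the name is built; it does not touch the cluster.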
Any suggestions on how to proceed? Any hints are appreciated!
MDS Log:
--------------------------
Feb 21 16:11:07 herta systemd[1]: Started Ceph metadata server daemon.
Feb 21 16:11:07 herta ceph-mds[128287]: starting mds.herta at
Feb 21 16:11:09 herta ceph-mds[128287]: 2022-02-21T16:11:09.661+0100
7fd2cd290700 -1 log_channel(cluster) log [ERR] : bad backtrace on
directory inode 0x10002000f5d
Feb 21 16:11:10 herta ceph-mds[128287]: ./src/mds/CInode.cc: In function
'CDir* CInode::get_or_open_dirfrag(MDCache*, frag_t)' thread
7fd2cd290700 time 2022-02-21T16:11:10.629363+0100
Feb 21 16:11:10 herta ceph-mds[128287]: ./src/mds/CInode.cc: 785: FAILED
ceph_assert(is_dir())
Feb 21 16:11:10 herta ceph-mds[128287]: ceph version 16.2.7
(f9aa029788115b5df5eeee328f584156565ee5b7) pacific (stable)
Feb 21 16:11:10 herta ceph-mds[128287]: 1:
(ceph::__ceph_assert_fail(char const*, char const*, int, char
const*)+0x124) [0x7fd2d876e046]
Feb 21 16:11:10 herta ceph-mds[128287]: 2:
/usr/lib/ceph/libceph-common.so.2(+0x2511d1) [0x7fd2d876e1d1]
Feb 21 16:11:10 herta ceph-mds[128287]: 3:
(CInode::get_or_open_dirfrag(MDCache*, frag_t)+0x105) [0x557ec94be365]
Feb 21 16:11:10 herta ceph-mds[128287]: 4:
(OpenFileTable::_prefetch_dirfrags()+0x2ad) [0x557ec956645d]
Feb 21 16:11:10 herta ceph-mds[128287]: 5:
(MDSContext::complete(int)+0x50) [0x557ec9537980]
Feb 21 16:11:10 herta ceph-mds[128287]: 6: (void
finish_contexts<std::vector<MDSContext*, std::allocator<MDSContext*> >
>(ceph::common::CephContext*, std::vector<MDSContext*,
std::allocator<MDSContext*> >&, int)+0x98) [0x557ec920dd58]
Feb 21 16:11:10 herta ceph-mds[128287]: 7:
(MDCache::open_ino_finish(inodeno_t, MDCache::open_ino_info_t&,
int)+0x138) [0x557ec935bfc8]
Feb 21 16:11:10 herta ceph-mds[128287]: 8:
(MDCache::_open_ino_backtrace_fetched(inodeno_t,
ceph::buffer::v15_2_0::list&, int)+0x277) [0x557ec9363717]
Feb 21 16:11:10 herta ceph-mds[128287]: 9:
(MDSContext::complete(int)+0x50) [0x557ec9537980]
Feb 21 16:11:10 herta ceph-mds[128287]: 10:
(MDSIOContextBase::complete(int)+0x524) [0x557ec95380f4]
Feb 21 16:11:10 herta ceph-mds[128287]: 11:
(Finisher::finisher_thread_entry()+0x18d) [0x7fd2d880bc0d]
Feb 21 16:11:10 herta ceph-mds[128287]: 12:
/lib/x86_64-linux-gnu/libpthread.so.0(+0x8ea7) [0x7fd2d84c9ea7]
Feb 21 16:11:10 herta ceph-mds[128287]: 13: clone()
Feb 21 16:11:10 herta ceph-mds[128287]: *** Caught signal (Aborted) **
Feb 21 16:11:10 herta ceph-mds[128287]: in thread 7fd2cd290700
thread_name:MR_Finisher
Feb 21 16:11:10 herta ceph-mds[128287]: 2022-02-21T16:11:10.625+0100
7fd2cd290700 -1 ./src/mds/CInode.cc: In function 'CDir*
CInode::get_or_open_dirfrag(MDCache*, frag_t)' thread 7fd2cd290700 time
2022-02-21T16:11:10.629363+0100
Feb 21 16:11:10 herta ceph-mds[128287]: ./src/mds/CInode.cc: 785: FAILED
ceph_assert(is_dir())
Feb 21 16:11:10 herta ceph-mds[128287]: ceph version 16.2.7
(f9aa029788115b5df5eeee328f584156565ee5b7) pacific (stable)
Feb 21 16:11:10 herta ceph-mds[128287]: 1:
(ceph::__ceph_assert_fail(char const*, char const*, int, char
const*)+0x124) [0x7fd2d876e046]
Feb 21 16:11:10 herta ceph-mds[128287]: 2:
/usr/lib/ceph/libceph-common.so.2(+0x2511d1) [0x7fd2d876e1d1]
Feb 21 16:11:10 herta ceph-mds[128287]: 3:
(CInode::get_or_open_dirfrag(MDCache*, frag_t)+0x105) [0x557ec94be365]
Feb 21 16:11:10 herta ceph-mds[128287]: 4:
(OpenFileTable::_prefetch_dirfrags()+0x2ad) [0x557ec956645d]
Feb 21 16:11:10 herta ceph-mds[128287]: 5:
(MDSContext::complete(int)+0x50) [0x557ec9537980]
Feb 21 16:11:10 herta ceph-mds[128287]: 6: (void
finish_contexts<std::vector<MDSContext*, std::allocator<MDSContext*> >
>(ceph::common::CephContext*, std::vector<MDSContext*,
std::allocator<MDSContext*> >&, int)+0x98) [0x557ec920dd58]
Feb 21 16:11:10 herta ceph-mds[128287]: 7:
(MDCache::open_ino_finish(inodeno_t, MDCache::open_ino_info_t&,
int)+0x138) [0x557ec935bfc8]
Feb 21 16:11:10 herta ceph-mds[128287]: 8:
(MDCache::_open_ino_backtrace_fetched(inodeno_t,
ceph::buffer::v15_2_0::list&, int)+0x277) [0x557ec9363717]
Feb 21 16:11:10 herta ceph-mds[128287]: 9:
(MDSContext::complete(int)+0x50) [0x557ec9537980]
Feb 21 16:11:10 herta ceph-mds[128287]: 10:
(MDSIOContextBase::complete(int)+0x524) [0x557ec95380f4]
Feb 21 16:11:10 herta ceph-mds[128287]: 11:
(Finisher::finisher_thread_entry()+0x18d) [0x7fd2d880bc0d]
Feb 21 16:11:10 herta ceph-mds[128287]: 12:
/lib/x86_64-linux-gnu/libpthread.so.0(+0x8ea7) [0x7fd2d84c9ea7]
Feb 21 16:11:10 herta ceph-mds[128287]: 13: clone()
Feb 21 16:11:10 herta ceph-mds[128287]: ceph version 16.2.7
(f9aa029788115b5df5eeee328f584156565ee5b7) pacific (stable)
Feb 21 16:11:10 herta ceph-mds[128287]: 1:
/lib/x86_64-linux-gnu/libpthread.so.0(+0x14140) [0x7fd2d84d5140]
Feb 21 16:11:10 herta ceph-mds[128287]: 2: gsignal()
Feb 21 16:11:10 herta ceph-mds[128287]: 3: abort()
Feb 21 16:11:10 herta ceph-mds[128287]: 4:
(ceph::__ceph_assert_fail(char const*, char const*, int, char
const*)+0x16e) [0x7fd2d876e090]
Feb 21 16:11:10 herta ceph-mds[128287]: 5:
/usr/lib/ceph/libceph-common.so.2(+0x2511d1) [0x7fd2d876e1d1]
Feb 21 16:11:10 herta ceph-mds[128287]: 6:
(CInode::get_or_open_dirfrag(MDCache*, frag_t)+0x105) [0x557ec94be365]
Feb 21 16:11:10 herta ceph-mds[128287]: 7:
(OpenFileTable::_prefetch_dirfrags()+0x2ad) [0x557ec956645d]
Feb 21 16:11:10 herta ceph-mds[128287]: 8:
(MDSContext::complete(int)+0x50) [0x557ec9537980]
Feb 21 16:11:10 herta ceph-mds[128287]: 9: (void
finish_contexts<std::vector<MDSContext*, std::allocator<MDSContext*> >
>(ceph::common::CephContext*, std::vector<MDSContext*,
std::allocator<MDSContext*> >&, int)+0x98) [0x557ec920dd58]
Feb 21 16:11:10 herta ceph-mds[128287]: 10:
(MDCache::open_ino_finish(inodeno_t, MDCache::open_ino_info_t&,
int)+0x138) [0x557ec935bfc8]
Feb 21 16:11:10 herta ceph-mds[128287]: 11:
(MDCache::_open_ino_backtrace_fetched(inodeno_t,
ceph::buffer::v15_2_0::list&, int)+0x277) [0x557ec9363717]
Feb 21 16:11:10 herta ceph-mds[128287]: 12:
(MDSContext::complete(int)+0x50) [0x557ec9537980]
Feb 21 16:11:10 herta ceph-mds[128287]: 13:
(MDSIOContextBase::complete(int)+0x524) [0x557ec95380f4]
Feb 21 16:11:10 herta ceph-mds[128287]: 14:
(Finisher::finisher_thread_entry()+0x18d) [0x7fd2d880bc0d]
Feb 21 16:11:10 herta ceph-mds[128287]: 15:
/lib/x86_64-linux-gnu/libpthread.so.0(+0x8ea7) [0x7fd2d84c9ea7]
Feb 21 16:11:10 herta ceph-mds[128287]: 16: clone()
Feb 21 16:11:10 herta ceph-mds[128287]: 2022-02-21T16:11:10.629+0100
7fd2cd290700 -1 *** Caught signal (Aborted) **
Feb 21 16:11:10 herta ceph-mds[128287]: in thread 7fd2cd290700
thread_name:MR_Finisher
Feb 21 16:11:10 herta ceph-mds[128287]: ceph version 16.2.7
(f9aa029788115b5df5eeee328f584156565ee5b7) pacific (stable)
Feb 21 16:11:10 herta ceph-mds[128287]: 1:
/lib/x86_64-linux-gnu/libpthread.so.0(+0x14140) [0x7fd2d84d5140]
Feb 21 16:11:10 herta ceph-mds[128287]: 2: gsignal()
Feb 21 16:11:10 herta ceph-mds[128287]: 3: abort()
Feb 21 16:11:10 herta ceph-mds[128287]: 4:
(ceph::__ceph_assert_fail(char const*, char const*, int, char
const*)+0x16e) [0x7fd2d876e090]
Feb 21 16:11:10 herta ceph-mds[128287]: 5:
/usr/lib/ceph/libceph-common.so.2(+0x2511d1) [0x7fd2d876e1d1]
Feb 21 16:11:10 herta ceph-mds[128287]: 6:
(CInode::get_or_open_dirfrag(MDCache*, frag_t)+0x105) [0x557ec94be365]
Feb 21 16:11:10 herta ceph-mds[128287]: 7:
(OpenFileTable::_prefetch_dirfrags()+0x2ad) [0x557ec956645d]
Feb 21 16:11:10 herta ceph-mds[128287]: 8:
(MDSContext::complete(int)+0x50) [0x557ec9537980]
Feb 21 16:11:10 herta ceph-mds[128287]: 9: (void
finish_contexts<std::vector<MDSContext*, std::allocator<MDSContext*> >
>(ceph::common::CephContext*, std::vector<MDSContext*,
std::allocator<MDSContext*> >&, int)+0x98) [0x557ec920dd58]
Feb 21 16:11:10 herta ceph-mds[128287]: 10:
(MDCache::open_ino_finish(inodeno_t, MDCache::open_ino_info_t&,
int)+0x138) [0x557ec935bfc8]
Feb 21 16:11:10 herta ceph-mds[128287]: 11:
(MDCache::_open_ino_backtrace_fetched(inodeno_t,
ceph::buffer::v15_2_0::list&, int)+0x277) [0x557ec9363717]
Feb 21 16:11:10 herta ceph-mds[128287]: 12:
(MDSContext::complete(int)+0x50) [0x557ec9537980]
Feb 21 16:11:10 herta ceph-mds[128287]: 13:
(MDSIOContextBase::complete(int)+0x524) [0x557ec95380f4]
Feb 21 16:11:10 herta ceph-mds[128287]: 14:
(Finisher::finisher_thread_entry()+0x18d) [0x7fd2d880bc0d]
Feb 21 16:11:10 herta ceph-mds[128287]: 15:
/lib/x86_64-linux-gnu/libpthread.so.0(+0x8ea7) [0x7fd2d84c9ea7]
Feb 21 16:11:10 herta ceph-mds[128287]: 16: clone()
Feb 21 16:11:10 herta ceph-mds[128287]: NOTE: a copy of the executable,
or `objdump -rdS <executable>` is needed to interpret this.
Feb 21 16:11:10 herta ceph-mds[128287]: -1430>
2022-02-21T16:11:09.661+0100 7fd2cd290700 -1 log_channel(cluster) log
[ERR] : bad backtrace on directory inode 0x10002000f5d
Feb 21 16:11:10 herta ceph-mds[128287]: -1429>
2022-02-21T16:11:10.625+0100 7fd2cd290700 -1 ./src/mds/CInode.cc: In
function 'CDir* CInode::get_or_open_dirfrag(MDCache*, frag_t)' thread
7fd2cd290700 time 2022-02-21T16:11:10.629363+0100
Feb 21 16:11:10 herta ceph-mds[128287]: ./src/mds/CInode.cc: 785: FAILED
ceph_assert(is_dir())
Feb 21 16:11:10 herta ceph-mds[128287]: ceph version 16.2.7
(f9aa029788115b5df5eeee328f584156565ee5b7) pacific (stable)
Feb 21 16:11:10 herta ceph-mds[128287]: 1:
(ceph::__ceph_assert_fail(char const*, char const*, int, char
const*)+0x124) [0x7fd2d876e046]
Feb 21 16:11:10 herta ceph-mds[128287]: 2:
/usr/lib/ceph/libceph-common.so.2(+0x2511d1) [0x7fd2d876e1d1]
Feb 21 16:11:10 herta ceph-mds[128287]: 3:
(CInode::get_or_open_dirfrag(MDCache*, frag_t)+0x105) [0x557ec94be365]
Feb 21 16:11:10 herta ceph-mds[128287]: 4:
(OpenFileTable::_prefetch_dirfrags()+0x2ad) [0x557ec956645d]
Feb 21 16:11:10 herta ceph-mds[128287]: 5:
(MDSContext::complete(int)+0x50) [0x557ec9537980]
Feb 21 16:11:10 herta ceph-mds[128287]: 6: (void
finish_contexts<std::vector<MDSContext*, std::allocator<MDSContext*> >
>(ceph::common::CephContext*, std::vector<MDSContext*,
std::allocator<MDSContext*> >&, int)+0x98) [0x557ec920dd58]
Feb 21 16:11:10 herta ceph-mds[128287]: 7:
(MDCache::open_ino_finish(inodeno_t, MDCache::open_ino_info_t&,
int)+0x138) [0x557ec935bfc8]
Feb 21 16:11:10 herta ceph-mds[128287]: 8:
(MDCache::_open_ino_backtrace_fetched(inodeno_t,
ceph::buffer::v15_2_0::list&, int)+0x277) [0x557ec9363717]
Feb 21 16:11:10 herta ceph-mds[128287]: 9:
(MDSContext::complete(int)+0x50) [0x557ec9537980]
Feb 21 16:11:10 herta ceph-mds[128287]: 10:
(MDSIOContextBase::complete(int)+0x524) [0x557ec95380f4]
Feb 21 16:11:10 herta ceph-mds[128287]: 11:
(Finisher::finisher_thread_entry()+0x18d) [0x7fd2d880bc0d]
Feb 21 16:11:10 herta ceph-mds[128287]: 12:
/lib/x86_64-linux-gnu/libpthread.so.0(+0x8ea7) [0x7fd2d84c9ea7]
Feb 21 16:11:10 herta ceph-mds[128287]: 13: clone()
Feb 21 16:11:10 herta ceph-mds[128287]: -1428>
2022-02-21T16:11:10.629+0100 7fd2cd290700 -1 *** Caught signal (Aborted) **
Feb 21 16:11:10 herta ceph-mds[128287]: in thread 7fd2cd290700
thread_name:MR_Finisher
Feb 21 16:11:10 herta ceph-mds[128287]: ceph version 16.2.7
(f9aa029788115b5df5eeee328f584156565ee5b7) pacific (stable)
Feb 21 16:11:10 herta ceph-mds[128287]: 1:
/lib/x86_64-linux-gnu/libpthread.so.0(+0x14140) [0x7fd2d84d5140]
Feb 21 16:11:10 herta ceph-mds[128287]: 2: gsignal()
Feb 21 16:11:10 herta ceph-mds[128287]: 3: abort()
Feb 21 16:11:10 herta ceph-mds[128287]: 4:
(ceph::__ceph_assert_fail(char const*, char const*, int, char
const*)+0x16e) [0x7fd2d876e090]
Feb 21 16:11:10 herta ceph-mds[128287]: 5:
/usr/lib/ceph/libceph-common.so.2(+0x2511d1) [0x7fd2d876e1d1]
Feb 21 16:11:10 herta ceph-mds[128287]: 6:
(CInode::get_or_open_dirfrag(MDCache*, frag_t)+0x105) [0x557ec94be365]
Feb 21 16:11:10 herta ceph-mds[128287]: 7:
(OpenFileTable::_prefetch_dirfrags()+0x2ad) [0x557ec956645d]
Feb 21 16:11:10 herta ceph-mds[128287]: 8:
(MDSContext::complete(int)+0x50) [0x557ec9537980]
Feb 21 16:11:10 herta ceph-mds[128287]: 9: (void
finish_contexts<std::vector<MDSContext*, std::allocator<MDSContext*> >
>(ceph::common::CephContext*, std::vector<MDSContext*,
std::allocator<MDSContext*> >&, int)+0x98) [0x557ec920dd58]
Feb 21 16:11:10 herta ceph-mds[128287]: 10:
(MDCache::open_ino_finish(inodeno_t, MDCache::open_ino_info_t&,
int)+0x138) [0x557ec935bfc8]
Feb 21 16:11:10 herta ceph-mds[128287]: 11:
(MDCache::_open_ino_backtrace_fetched(inodeno_t,
ceph::buffer::v15_2_0::list&, int)+0x277) [0x557ec9363717]
Feb 21 16:11:10 herta ceph-mds[128287]: 12:
(MDSContext::complete(int)+0x50) [0x557ec9537980]
Feb 21 16:11:10 herta ceph-mds[128287]: 13:
(MDSIOContextBase::complete(int)+0x524) [0x557ec95380f4]
Feb 21 16:11:10 herta ceph-mds[128287]: 14:
(Finisher::finisher_thread_entry()+0x18d) [0x7fd2d880bc0d]
Feb 21 16:11:10 herta ceph-mds[128287]: 15:
/lib/x86_64-linux-gnu/libpthread.so.0(+0x8ea7) [0x7fd2d84c9ea7]
Feb 21 16:11:10 herta ceph-mds[128287]: 16: clone()
Feb 21 16:11:10 herta ceph-mds[128287]: NOTE: a copy of the executable,
or `objdump -rdS <executable>` is needed to interpret this.
--------------------------