On 6/19/24 16:13, Dietmar Rieder wrote:
Hi Xiubo,
[...]
0> 2024-06-19T07:12:39.236+0000 7f90fa912700 -1 *** Caught signal (Aborted) **
in thread 7f90fa912700 thread_name:md_log_replay
ceph version 18.2.2 (531c0d11a1c5d39fbfe6aa8a521f023abf3bf3e2) reef (stable)
1: /lib64/libpthread.so.0(+0x12d20) [0x7f910b4d2d20]
2: gsignal()
3: abort()
4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x18f) [0x7f910c722e6f]
5: /usr/lib64/ceph/libceph-common.so.2(+0x2a9fdb) [0x7f910c722fdb]
6: (interval_set<inodeno_t, std::map>::erase(inodeno_t, inodeno_t, std::function<bool (inodeno_t, inodeno_t)>)+0x2e5) [0x55a93c0de9a5]
7: (EMetaBlob::replay(MDSRank*, LogSegment*, int, MDPeerUpdate*)+0x4207) [0x55a93c3e76e7]
8: (EUpdate::replay(MDSRank*)+0x61) [0x55a93c3e9f81]
9: (MDLog::_replay_thread()+0x6c9) [0x55a93c3701d9]
10: (MDLog::ReplayThread::entry()+0x11) [0x55a93c01e2d1]
11: /lib64/libpthread.so.0(+0x81ca) [0x7f910b4c81ca]
12: clone()
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
This is a known bug; please see https://tracker.ceph.com/issues/61009.
As a workaround, I am afraid you need to trim the journal logs first and
then try to restart the MDS daemons. At the same time, please follow the
workaround in https://tracker.ceph.com/issues/61009#note-26
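For illustration, a rough sketch of taking the filesystem offline and
backing up the journal before trimming anything (not tailored to your
cluster; "cephfs" and "backup.bin" are placeholders for your filesystem
name and backup file, and rank 0 assumes a single active MDS):

  # mark the filesystem failed so no MDS keeps trying to replay the journal
  ceph fs fail cephfs

  # export a backup of the rank-0 journal before any destructive operation
  cephfs-journal-tool --rank=cephfs:0 journal export backup.bin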
I see, I'll try to do this. Are there any caveats or issues to expect
from trimming the journal logs?
You will certainly lose any dirty metadata still in the journals.
Is there a step-by-step guide on how to perform the trimming? Should
all MDS daemons be stopped first?
Please follow
https://docs.ceph.com/en/nautilus/cephfs/disaster-recovery-experts/#disaster-recovery-experts.
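Roughly, the journal-truncation part described there looks like the
sketch below (please verify against the docs for your exact release;
"cephfs" is a placeholder for your filesystem name, rank 0 assumes a
single active MDS, and a journal backup as above is assumed):

  # salvage whatever dentries can still be read from the journal
  cephfs-journal-tool --rank=cephfs:0 event recover_dentries summary

  # truncate (reset) the corrupted journal
  cephfs-journal-tool --rank=cephfs:0 journal reset

  # wipe the session table, then let MDS daemons join the filesystem again
  cephfs-table-tool all reset session
  ceph fs set cephfs joinable true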
Sorry for all the (naive) questions, but I do not want to make any
mistakes here.
The journal logs were corrupted and couldn't be replayed by the MDS at
startup, so the MDS will keep crashing unless you manually repair or
truncate the journal.
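As a sketch, you can confirm the damage before repairing and then check
that the daemons get past replay afterwards (again, "cephfs" is a
placeholder for your filesystem name):

  # read-only integrity check of the rank-0 journal
  cephfs-journal-tool --rank=cephfs:0 journal inspect

  # after restarting the MDS daemons, watch the ranks and overall health
  ceph fs status
  ceph -s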
Thanks
- Xiubo
Thanks for your support,
Dietmar
--- logging levels ---
0/ 5 none
0/ 1 lockdep
0/ 1 context
1/ 1 crush
1/ 5 mds
1/ 5 mds_balancer
1/ 5 mds_locker
1/ 5 mds_log
1/ 5 mds_log_expire
1/ 5 mds_migrator
0/ 1 buffer
0/ 1 timer
0/ 1 filer
0/ 1 striper
0/ 1 objecter
0/ 5 rados
0/ 5 rbd
0/ 5 rbd_mirror
0/ 5 rbd_replay
0/ 5 rbd_pwl
0/ 5 journaler
0/ 5 objectcacher
0/ 5 immutable_obj_cache
0/ 5 client
1/ 5 osd
0/ 5 optracker
0/ 5 objclass
1/ 3 filestore
1/ 3 journal
0/ 0 ms
1/ 5 mon
0/10 monc
1/ 5 paxos
0/ 5 tp
1/ 5 auth
1/ 5 crypto
1/ 1 finisher
1/ 1 reserver
1/ 5 heartbeatmap
1/ 5 perfcounter
1/ 5 rgw
1/ 5 rgw_sync
1/ 5 rgw_datacache
1/ 5 rgw_access
1/ 5 rgw_dbstore
1/ 5 rgw_flight
1/ 5 javaclient
1/ 5 asok
1/ 1 throttle
0/ 0 refs
1/ 5 compressor
1/ 5 bluestore
1/ 5 bluefs
1/ 3 bdev
1/ 5 kstore
4/ 5 rocksdb
4/ 5 leveldb
1/ 5 fuse
2/ 5 mgr
1/ 5 mgrc
1/ 5 dpdk
1/ 5 eventtrace
1/ 5 prioritycache
0/ 5 test
0/ 5 cephfs_mirror
0/ 5 cephsqlite
0/ 5 seastore
0/ 5 seastore_onode
0/ 5 seastore_odata
0/ 5 seastore_omap
0/ 5 seastore_tm
0/ 5 seastore_t
0/ 5 seastore_cleaner
0/ 5 seastore_epm
0/ 5 seastore_lba
0/ 5 seastore_fixedkv_tree
0/ 5 seastore_cache
0/ 5 seastore_journal
0/ 5 seastore_device
0/ 5 seastore_backref
0/ 5 alienstore
1/ 5 mclock
0/ 5 cyanstore
1/ 5 ceph_exporter
1/ 5 memstore
-2/-2 (syslog threshold)
-1/-1 (stderr threshold)
--- pthread ID / name mapping for recent threads ---
7f90fa912700 / md_log_replay
7f90fb914700 /
7f90fc115700 / MR_Finisher
7f90fd117700 / PQ_Finisher
7f90fe119700 / ms_dispatch
7f910011d700 / ceph-mds
7f9102121700 / ms_dispatch
7f9103123700 / io_context_pool
7f9104125700 / admin_socket
7f9104926700 / msgr-worker-2
7f9105127700 / msgr-worker-1
7f9105928700 / msgr-worker-0
7f910d8eab00 / ceph-mds
max_recent 10000
max_new 1000
log_file /var/log/ceph/ceph-mds.default.cephmon-02.duujba.log
--- end dump of recent events ---
I have no idea how to resolve this and would be grateful for any help.
Dietmar
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx