On 6/19/24 11:15, Dietmar Rieder wrote:
On 6/19/24 10:30, Xiubo Li wrote:

On 6/19/24 16:13, Dietmar Rieder wrote:

Hi Xiubo,

[...]

     0> 2024-06-19T07:12:39.236+0000 7f90fa912700 -1 *** Caught signal (Aborted) **
 in thread 7f90fa912700 thread_name:md_log_replay

 ceph version 18.2.2 (531c0d11a1c5d39fbfe6aa8a521f023abf3bf3e2) reef (stable)
 1: /lib64/libpthread.so.0(+0x12d20) [0x7f910b4d2d20]
 2: gsignal()
 3: abort()
 4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x18f) [0x7f910c722e6f]
 5: /usr/lib64/ceph/libceph-common.so.2(+0x2a9fdb) [0x7f910c722fdb]
 6: (interval_set<inodeno_t, std::map>::erase(inodeno_t, inodeno_t, std::function<bool (inodeno_t, inodeno_t)>)+0x2e5) [0x55a93c0de9a5]
 7: (EMetaBlob::replay(MDSRank*, LogSegment*, int, MDPeerUpdate*)+0x4207) [0x55a93c3e76e7]
 8: (EUpdate::replay(MDSRank*)+0x61) [0x55a93c3e9f81]
 9: (MDLog::_replay_thread()+0x6c9) [0x55a93c3701d9]
 10: (MDLog::ReplayThread::entry()+0x11) [0x55a93c01e2d1]
 11: /lib64/libpthread.so.0(+0x81ca) [0x7f910b4c81ca]
 12: clone()
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

This is a known bug, please see https://tracker.ceph.com/issues/61009. As a workaround, I am afraid you need to trim the journal logs first and then try to restart the MDS daemons. At the same time, please follow the workaround in https://tracker.ceph.com/issues/61009#note-26

I see, I'll try to do this. Are there any caveats or issues to expect from trimming the journal logs?

Certainly you will lose the dirty metadata in the journals.

Is there a step-by-step guide on how to perform the trimming? Should all MDS be stopped before?

Please follow https://docs.ceph.com/en/nautilus/cephfs/disaster-recovery-experts/#disaster-recovery-experts.

OK, when I run the cephfs-journal-tool I get an error:

# cephfs-journal-tool journal export backup.bin
Error ((22) Invalid argument)

My cluster is managed by cephadm, so (in my stress situation) I'm not able to find the correct way to use cephfs-journal-tool.

I'm sure it is something stupid that I'm missing, but I'd be happy for any hint.
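(In case it helps anyone else hitting the same "(22) Invalid argument": below is a hedged sketch of how cephfs-journal-tool is typically invoked on a cephadm-managed cluster, i.e. from inside the cephadm shell and with an explicit --rank, matching the --rank=cephfs:<n> form used further down. The backup file names are only illustrative, and I can't confirm that the missing --rank was the actual cause of the EINVAL above.)

# run on a host that has the admin keyring; opens a shell in a container with the ceph tools
cephadm shell

# inside the shell: export each rank's journal to a backup file before changing anything
cephfs-journal-tool --rank=cephfs:0 journal export backup.rank0.bin
cephfs-journal-tool --rank=cephfs:1 journal export backup.rank1.bin
cephfs-journal-tool --rank=cephfs:2 journal export backup.rank2.bin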
I ran the disaster recovery procedures now, as follows:

[root@ceph01-b /]# cephfs-journal-tool --rank=cephfs:0 event recover_dentries summary
Events by type:
  OPEN: 8737
  PURGED: 1
  SESSION: 9
  SESSIONS: 2
  SUBTREEMAP: 128
  TABLECLIENT: 2
  TABLESERVER: 30
  UPDATE: 9207
Errors: 0

[root@ceph01-b /]# cephfs-journal-tool --rank=cephfs:1 event recover_dentries summary
Events by type:
  OPEN: 3
  SESSION: 1
  SUBTREEMAP: 34
  UPDATE: 32965
Errors: 0

[root@ceph01-b /]# cephfs-journal-tool --rank=cephfs:2 event recover_dentries summary
Events by type:
  OPEN: 5289
  SESSION: 10
  SESSIONS: 3
  SUBTREEMAP: 128
  UPDATE: 76448
Errors: 0

[root@ceph01-b /]# cephfs-journal-tool --rank=cephfs:all journal inspect
Overall journal integrity: OK
Overall journal integrity: DAMAGED
Corrupt regions:
  0xd9a84f243c-ffffffffffffffff
Overall journal integrity: OK

[root@ceph01-b /]# cephfs-journal-tool --rank=cephfs:0 journal inspect
Overall journal integrity: OK

[root@ceph01-b /]# cephfs-journal-tool --rank=cephfs:1 journal inspect
Overall journal integrity: DAMAGED
Corrupt regions:
  0xd9a84f243c-ffffffffffffffff

[root@ceph01-b /]# cephfs-journal-tool --rank=cephfs:2 journal inspect
Overall journal integrity: OK

[root@ceph01-b /]# cephfs-journal-tool --rank=cephfs:0 journal reset
old journal was 879331755046~508520587
new journal start will be 879843344384 (3068751 bytes past old end)
writing journal head
writing EResetJournal entry
done

[root@ceph01-b /]# cephfs-journal-tool --rank=cephfs:1 journal reset
old journal was 934711229813~120432327
new journal start will be 934834864128 (3201988 bytes past old end)
writing journal head
writing EResetJournal entry
done

[root@ceph01-b /]# cephfs-journal-tool --rank=cephfs:2 journal reset
old journal was 1334153584288~252692691
new journal start will be 1334409428992 (3152013 bytes past old end)
writing journal head
writing EResetJournal entry
done

[root@ceph01-b /]# cephfs-table-tool all reset session
{
    "0": {
        "data": {},
        "result": 0
    },
    "1": {
        "data": {},
        "result": 0
    },
    "2": {
        "data": {},
        "result": 0
    }
}

[root@ceph01-b /]# cephfs-journal-tool --rank=cephfs:1 journal inspect
Overall journal integrity: OK

[root@ceph01-b /]# ceph fs reset cephfs --yes-i-really-mean-it

But now I hit the error below:

   -20> 2024-06-19T11:13:00.610+0000 7ff3694d0700 10 monclient: _send_mon_message to mon.cephmon-03 at v2:10.1.3.23:3300/0
   -19> 2024-06-19T11:13:00.637+0000 7ff3664ca700 2 mds.0.cache Memory usage: total 485928, rss 170860, heap 207156, baseline 182580, 0 / 33434 inodes have caps, 0 caps, 0 caps per inode
   -18> 2024-06-19T11:13:00.787+0000 7ff36a4d2700 1 mds.default.cephmon-03.chjusj Updating MDS map to version 8061 from mon.1
   -17> 2024-06-19T11:13:00.787+0000 7ff36a4d2700 1 mds.0.8058 handle_mds_map i am now mds.0.8058
   -16> 2024-06-19T11:13:00.787+0000 7ff36a4d2700 1 mds.0.8058 handle_mds_map state change up:rejoin --> up:active
   -15> 2024-06-19T11:13:00.787+0000 7ff36a4d2700 1 mds.0.8058 recovery_done -- successful recovery!
   -14> 2024-06-19T11:13:00.788+0000 7ff36a4d2700 1 mds.0.8058 active_start
   -13> 2024-06-19T11:13:00.789+0000 7ff36dcd9700 5 mds.beacon.default.cephmon-03.chjusj received beacon reply up:active seq 4 rtt 0.955007
   -12> 2024-06-19T11:13:00.790+0000 7ff36a4d2700 1 mds.0.8058 cluster recovered.
   -11> 2024-06-19T11:13:00.790+0000 7ff36a4d2700 4 mds.0.8058 set_osd_epoch_barrier: epoch=33596
   -10> 2024-06-19T11:13:00.790+0000 7ff3634c4700 5 mds.0.log _submit_thread 879843344432~2609 : EUpdate check_inode_max_size [metablob 0x100, 2 dirs]
    -9> 2024-06-19T11:13:00.791+0000 7ff3644c6700 -1 /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/18.2.2/rpm/el8/BUILD/ceph-18.2.2/src/mds/MDCache.cc: In function 'void MDCache::journal_cow_dentry(MutationImpl*, EMetaBlob*, CDentry*, snapid_t, CInode**, CDentry::linkage_t*)' thread 7ff3644c6700 time 2024-06-19T11:13:00.791580+0000
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/18.2.2/rpm/el8/BUILD/ceph-18.2.2/src/mds/MDCache.cc: 1660: FAILED ceph_assert(follows >= realm->get_newest_seq())

 ceph version 18.2.2 (531c0d11a1c5d39fbfe6aa8a521f023abf3bf3e2) reef (stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x135) [0x7ff374ad3e15]
 2: /usr/lib64/ceph/libceph-common.so.2(+0x2a9fdb) [0x7ff374ad3fdb]
 3: (MDCache::journal_cow_dentry(MutationImpl*, EMetaBlob*, CDentry*, snapid_t, CInode**, CDentry::linkage_t*)+0x13c7) [0x55da0a7aa227]
 4: (MDCache::journal_dirty_inode(MutationImpl*, EMetaBlob*, CInode*, snapid_t)+0xc5) [0x55da0a7aa3a5]
 5: (Locker::check_inode_max_size(CInode*, bool, unsigned long, unsigned long, utime_t)+0x84d) [0x55da0a88ce3d]
 6: (RecoveryQueue::_recovered(CInode*, int, unsigned long, utime_t)+0x4f0) [0x55da0a85ad50]
 7: (MDSContext::complete(int)+0x5f) [0x55da0a9ddeef]
 8: (MDSIOContextBase::complete(int)+0x524) [0x55da0a9de674]
 9: (Filer::C_Probe::finish(int)+0xbb) [0x55da0aa9dc9b]
 10: (Context::complete(int)+0xd) [0x55da0a6775fd]
 11: (Finisher::finisher_thread_entry()+0x18d) [0x7ff374b77abd]
 12: /lib64/libpthread.so.0(+0x81ca) [0x7ff3738791ca]
 13: clone()

    -8> 2024-06-19T11:13:00.792+0000 7ff36a4d2700 10 log_client handle_log_ack log(last 7) v1
    -7> 2024-06-19T11:13:00.792+0000 7ff36a4d2700 10 log_client logged 2024-06-19T11:12:59.647346+0000 mds.default.cephmon-03.chjusj (mds.0) 1 : cluster [ERR] loaded dup inode 0x10003e45d99 [415,head] v61632 at /home/balaz/.bash_history-54696.tmp, but inode 0x10003e45d99.head v61639 already exists at /home/balaz/.bash_history
    -6> 2024-06-19T11:13:00.792+0000 7ff36a4d2700 10 log_client logged 2024-06-19T11:12:59.648139+0000 mds.default.cephmon-03.chjusj (mds.0) 2 : cluster [ERR] loaded dup inode 0x10003e45d7c [415,head] v253612 at /home/rieder/.bash_history-10215.tmp, but inode 0x10003e45d7c.head v253630 already exists at /home/rieder/.bash_history
    -5> 2024-06-19T11:13:00.792+0000 7ff36a4d2700 10 log_client logged 2024-06-19T11:12:59.649483+0000 mds.default.cephmon-03.chjusj (mds.0) 3 : cluster [ERR] loaded dup inode 0x10003e45d83 [415,head] v164103 at /home/gottschling/.bash_history-44802.tmp, but inode 0x10003e45d83.head v164112 already exists at /home/gottschling/.bash_history
    -4> 2024-06-19T11:13:00.792+0000 7ff36a4d2700 10 log_client logged 2024-06-19T11:12:59.656221+0000 mds.default.cephmon-03.chjusj (mds.0) 4 : cluster [ERR] bad backtrace on directory inode 0x10003e42340
    -3> 2024-06-19T11:13:00.792+0000 7ff36a4d2700 10 log_client logged 2024-06-19T11:12:59.737282+0000 mds.default.cephmon-03.chjusj (mds.0) 5 : cluster [ERR] bad backtrace on directory inode 0x10003e45d8b
    -2> 2024-06-19T11:13:00.792+0000 7ff36a4d2700 10 log_client logged 2024-06-19T11:12:59.804984+0000 mds.default.cephmon-03.chjusj (mds.0) 6 : cluster [ERR] bad backtrace on directory inode 0x10003e45d9f
    -1> 2024-06-19T11:13:00.792+0000 7ff36a4d2700 10 log_client logged 2024-06-19T11:12:59.805078+0000 mds.default.cephmon-03.chjusj (mds.0) 7 : cluster [ERR] bad backtrace on directory inode 0x10003e45d90
     0> 2024-06-19T11:13:00.792+0000 7ff3644c6700 -1 *** Caught signal (Aborted) **
 in thread 7ff3644c6700 thread_name:MR_Finisher

 ceph version 18.2.2 (531c0d11a1c5d39fbfe6aa8a521f023abf3bf3e2) reef (stable)
 1: /lib64/libpthread.so.0(+0x12d20) [0x7ff373883d20]
 2: gsignal()
 3: abort()
 4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x18f) [0x7ff374ad3e6f]
 5: /usr/lib64/ceph/libceph-common.so.2(+0x2a9fdb) [0x7ff374ad3fdb]
 6: (MDCache::journal_cow_dentry(MutationImpl*, EMetaBlob*, CDentry*, snapid_t, CInode**, CDentry::linkage_t*)+0x13c7) [0x55da0a7aa227]
 7: (MDCache::journal_dirty_inode(MutationImpl*, EMetaBlob*, CInode*, snapid_t)+0xc5) [0x55da0a7aa3a5]
 8: (Locker::check_inode_max_size(CInode*, bool, unsigned long, unsigned long, utime_t)+0x84d) [0x55da0a88ce3d]
 9: (RecoveryQueue::_recovered(CInode*, int, unsigned long, utime_t)+0x4f0) [0x55da0a85ad50]
 10: (MDSContext::complete(int)+0x5f) [0x55da0a9ddeef]
 11: (MDSIOContextBase::complete(int)+0x524) [0x55da0a9de674]
 12: (Filer::C_Probe::finish(int)+0xbb) [0x55da0aa9dc9b]
 13: (Context::complete(int)+0xd) [0x55da0a6775fd]
 14: (Finisher::finisher_thread_entry()+0x18d) [0x7ff374b77abd]
 15: /lib64/libpthread.so.0(+0x81ca) [0x7ff3738791ca]
 16: clone()
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
--- logging levels ---
   0/ 5 none   0/ 1 lockdep   0/ 1 context   1/ 1 crush   1/ 5 mds   1/ 5 mds_balancer   1/ 5 mds_locker   1/ 5 mds_log   1/ 5 mds_log_expire   1/ 5 mds_migrator
   0/ 1 buffer   0/ 1 timer   0/ 1 filer   0/ 1 striper   0/ 1 objecter   0/ 5 rados   0/ 5 rbd   0/ 5 rbd_mirror   0/ 5 rbd_replay   0/ 5 rbd_pwl
   0/ 5 journaler   0/ 5 objectcacher   0/ 5 immutable_obj_cache   0/ 5 client   1/ 5 osd   0/ 5 optracker   0/ 5 objclass   1/ 3 filestore   1/ 3 journal   0/ 0 ms
   1/ 5 mon   0/10 monc   1/ 5 paxos   0/ 5 tp   1/ 5 auth   1/ 5 crypto   1/ 1 finisher   1/ 1 reserver   1/ 5 heartbeatmap   1/ 5 perfcounter
   1/ 5 rgw   1/ 5 rgw_sync   1/ 5 rgw_datacache   1/ 5 rgw_access   1/ 5 rgw_dbstore   1/ 5 rgw_flight   1/ 5 javaclient   1/ 5 asok   1/ 1 throttle   0/ 0 refs
   1/ 5 compressor   1/ 5 bluestore   1/ 5 bluefs   1/ 3 bdev   1/ 5 kstore   4/ 5 rocksdb   4/ 5 leveldb   1/ 5 fuse   2/ 5 mgr   1/ 5 mgrc
   1/ 5 dpdk   1/ 5 eventtrace   1/ 5 prioritycache   0/ 5 test   0/ 5 cephfs_mirror   0/ 5 cephsqlite
   0/ 5 seastore   0/ 5 seastore_onode   0/ 5 seastore_odata   0/ 5 seastore_omap   0/ 5 seastore_tm   0/ 5 seastore_t   0/ 5 seastore_cleaner   0/ 5 seastore_epm   0/ 5 seastore_lba   0/ 5 seastore_fixedkv_tree   0/ 5 seastore_cache   0/ 5 seastore_journal   0/ 5 seastore_device   0/ 5 seastore_backref   0/ 5 alienstore
   1/ 5 mclock   0/ 5 cyanstore   1/ 5 ceph_exporter   1/ 5 memstore
  -2/-2 (syslog threshold)
  -1/-1 (stderr threshold)
--- pthread ID / name mapping for recent threads ---
  7ff362cc3700 /
  7ff3634c4700 / md_submit
  7ff363cc5700 /
  7ff3644c6700 / MR_Finisher
  7ff3654c8700 / PQ_Finisher
  7ff365cc9700 / mds_rank_progr
  7ff3664ca700 / ms_dispatch
  7ff3684ce700 / ceph-mds
  7ff3694d0700 / safe_timer
  7ff36a4d2700 / ms_dispatch
  7ff36b4d4700 / io_context_pool
  7ff36c4d6700 / admin_socket
  7ff36ccd7700 / msgr-worker-2
  7ff36d4d8700 / msgr-worker-1
  7ff36dcd9700 / msgr-worker-0
  7ff375c9bb00 / ceph-mds
  max_recent 10000
  max_new 1000
  log_file /var/log/ceph/ceph-mds.default.cephmon-03.chjusj.log
--- end dump of recent events ---

Any idea?

Thanks
  Dietmar

> [...]
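PS, for completeness: the "bad backtrace" and "loaded dup inode" errors above are the kind of metadata damage that the CephFS forward scrub with repair is meant to address once an MDS stays up. A hedged sketch only (not something verified in this thread), assuming the filesystem name cephfs as above:

# ask rank 0 to scrub and repair the whole tree, then check progress and recorded damage
ceph tell mds.cephfs:0 scrub start / recursive,repair,force
ceph tell mds.cephfs:0 scrub status
ceph tell mds.cephfs:0 damage ls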
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx