I'm looking for guidance on how to recover after all MDS daemons keep crashing with a failed assert during journal replay (there is no MON damage).

Context: I've been working through failed MDS daemons for the past day, most likely caused by a large snaptrim operation that ground the cluster to a halt. After evicting all clients and restarting the MDS daemons (the clients appeared to be overwhelming them), the MDS daemons now fail to start with:

debug -1> 2024-07-24T18:44:52.674+0000 7f7878c22700 -1 /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/18.2.2/rpm/el8/BUILD/ceph-18.2.2/src/osdc/Journaler.cc: In function 'bool Journaler::try_read_entry(ceph::bufferlist&)' thread 7f7878c22700 time 2024-07-24T18:44:52.676027+0000
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/18.2.2/rpm/el8/BUILD/ceph-18.2.2/src/osdc/Journaler.cc: 1256: FAILED ceph_assert(start_ptr == read_pos)

 ceph version 18.2.2 (531c0d11a1c5d39fbfe6aa8a521f023abf3bf3e2) reef (stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x135) [0x7f788aa32e15]
 2: /usr/lib64/ceph/libceph-common.so.2(+0x2a9fdb) [0x7f788aa32fdb]
 3: (Journaler::try_read_entry(ceph::buffer::v15_2_0::list&)+0x132) [0x55555847ef32]
 4: (MDLog::_replay_thread()+0xda) [0x555558436bea]
 5: (MDLog::ReplayThread::entry()+0x11) [0x5555580e52d1]
 6: /lib64/libpthread.so.0(+0x81ca) [0x7f78897d81ca]
 7: clone()

debug 0> 2024-07-24T18:44:52.674+0000 7f7878c22700 -1 *** Caught signal (Aborted) **
 in thread 7f7878c22700 thread_name:md_log_replay

 ceph version 18.2.2 (531c0d11a1c5d39fbfe6aa8a521f023abf3bf3e2) reef (stable)
 1: /lib64/libpthread.so.0(+0x12d20) [0x7f78897e2d20]
 2: gsignal()
 3: abort()
 4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x18f) [0x7f788aa32e6f]
 5: /usr/lib64/ceph/libceph-common.so.2(+0x2a9fdb) [0x7f788aa32fdb]
 6: (Journaler::try_read_entry(ceph::buffer::v15_2_0::list&)+0x132) [0x55555847ef32]
 7: (MDLog::_replay_thread()+0xda) [0x555558436bea]
 8: (MDLog::ReplayThread::entry()+0x11) [0x5555580e52d1]
 9: /lib64/libpthread.so.0(+0x81ca) [0x7f78897d81ca]
 10: clone()

Normally three MDS daemons are deployed, one active and one on hot standby. The cluster believes any restarted MDS is attempting replay, but systemd reports an immediate crash with SIGABRT:

ceph mds stat
cephfs:1/1 {0=cephfs.sm1.esxjag=up:replay(laggy or crashed)}

Redeployed MDS daemons also continue to crash, which suggests a bad journal?
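Before touching anything destructive, my tentative plan (taken from the CephFS disaster-recovery docs, not run yet, so please tell me if it is the wrong approach for this particular assert) is to back up and inspect the rank 0 journal with cephfs-journal-tool. The filesystem name "cephfs" and rank 0 match my cluster; the export path is just an example:

  # take a full backup of the rank 0 journal before modifying anything
  cephfs-journal-tool --rank=cephfs:0 journal export /root/cephfs-rank0-journal.bin

  # check the journal for corruption (missing/overlapping objects)
  cephfs-journal-tool --rank=cephfs:0 journal inspect

  # dump the journal header (expire/read/write positions) for comparison
  cephfs-journal-tool --rank=cephfs:0 header get

I'm holding off on the more invasive steps (event recover_dentries summary followed by journal reset, then restarting the MDS and scrubbing) until someone can confirm that's the right path here, or whether this assert points at something else entirely.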