I'm looking for guidance on how to recover after all MDS daemons keep crashing with a failed assert during journal replay (there is no MON damage).

Context: I've been working through failed MDS daemons for the past day, most likely caused by a large snaptrim operation that ground the cluster to a halt. After evicting all clients and restarting the MDS daemons (the clients appeared to be overwhelming them), the MDS daemons now fail to start with:

debug -1> 2024-07-24T18:44:52.674+0000 7f7878c22700 -1 /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/18.2.2/rpm/el8/BUILD/ceph-18.2.2/src/osdc/Journaler.cc: In function 'bool Journaler::try_read_entry(ceph::bufferlist&)' thread 7f7878c22700 time 2024-07-24T18:44:52.676027+0000
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/18.2.2/rpm/el8/BUILD/ceph-18.2.2/src/osdc/Journaler.cc: 1256: FAILED ceph_assert(start_ptr == read_pos)

 ceph version 18.2.2 (531c0d11a1c5d39fbfe6aa8a521f023abf3bf3e2) reef (stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x135) [0x7f788aa32e15]
 2: /usr/lib64/ceph/libceph-common.so.2(+0x2a9fdb) [0x7f788aa32fdb]
 3: (Journaler::try_read_entry(ceph::buffer::v15_2_0::list&)+0x132) [0x55555847ef32]
 4: (MDLog::_replay_thread()+0xda) [0x555558436bea]
 5: (MDLog::ReplayThread::entry()+0x11) [0x5555580e52d1]
 6: /lib64/libpthread.so.0(+0x81ca) [0x7f78897d81ca]
 7: clone()

debug 0> 2024-07-24T18:44:52.674+0000 7f7878c22700 -1 *** Caught signal (Aborted) **
 in thread 7f7878c22700 thread_name:md_log_replay

 ceph version 18.2.2 (531c0d11a1c5d39fbfe6aa8a521f023abf3bf3e2) reef (stable)
 1: /lib64/libpthread.so.0(+0x12d20) [0x7f78897e2d20]
 2: gsignal()
 3: abort()
 4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x18f) [0x7f788aa32e6f]
 5: /usr/lib64/ceph/libceph-common.so.2(+0x2a9fdb) [0x7f788aa32fdb]
 6: (Journaler::try_read_entry(ceph::buffer::v15_2_0::list&)+0x132) [0x55555847ef32]
 7: (MDLog::_replay_thread()+0xda) [0x555558436bea]
 8: (MDLog::ReplayThread::entry()+0x11) [0x5555580e52d1]
 9: /lib64/libpthread.so.0(+0x81ca) [0x7f78897d81ca]
 10: clone()

Normally three MDS daemons are deployed, one active and one on hot standby. The cluster believes any restarted MDS is attempting replay, but systemd reports an immediate crash with SIGABRT:

ceph mds stat
cephfs:1/1 {0=cephfs.sm1.esxjag=up:replay(laggy or crashed)}

Redeployed MDS daemons also continue to crash, which suggests a bad journal?
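Before touching anything destructive, my tentative plan (taken from the CephFS disaster-recovery docs, not run yet, so please tell me if it is the wrong approach for this particular assert) is to back up and inspect the rank 0 journal with cephfs-journal-tool. The filesystem name "cephfs" and rank 0 match my cluster; the export path is just an example:

  # take a full backup of the rank 0 journal before modifying anything
  cephfs-journal-tool --rank=cephfs:0 journal export /root/cephfs-rank0-journal.bin

  # check the journal for corruption (missing/overlapping objects)
  cephfs-journal-tool --rank=cephfs:0 journal inspect

  # dump the journal header (expire/read/write positions) for comparison
  cephfs-journal-tool --rank=cephfs:0 header get

I'm holding off on the more invasive steps (event recover_dentries summary followed by journal reset, then restarting the MDS and scrubbing) until someone can confirm that's the right path here, or whether this assert points at something else entirely.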