Hello,

I've been having problems with my MDS daemons: they got stuck in the up:replay state. The journal was OK and everything else seemed OK, so I reset the journal, and now all MDS daemons fail to start with the following error:

2022-05-18 12:27:40.092 7f8748561700 -1 /home/abuild/rpmbuild/BUILD/ceph-14.2.16-402-g7d47dbaf4d/src/mds/PurgeQueue.cc: In function 'void PurgeQueue::_recover()' thread 7f8748561700 time 2022-05-18 12:27:40.094406
/home/abuild/rpmbuild/BUILD/ceph-14.2.16-402-g7d47dbaf4d/src/mds/PurgeQueue.cc: 286: FAILED ceph_assert(readable)

 ceph version 14.2.16-402-g7d47dbaf4d (7d47dbaf4d0960a2e910628360ae36def84ed913) nautilus (stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x152) [0x7f8756ca91a6]
 2: (ceph::__ceph_assertf_fail(char const*, char const*, int, char const*, char const*, ...)+0) [0x7f8756ca9381]
 3: (PurgeQueue::_recover()+0x4ad) [0x55f1666ad26d]
 4: (()+0x2b0353) [0x55f1666ad353]
 5: (FunctionContext::finish(int)+0x2c) [0x55f16652b63c]
 6: (Context::complete(int)+0x9) [0x55f166529339]
 7: (Finisher::finisher_thread_entry()+0x15e) [0x7f8756cf231e]
 8: (()+0x84f9) [0x7f87565a64f9]
 9: (clone()+0x3f) [0x7f87557affbf]

2022-05-18 12:27:40.092 7f8748561700 -1 *** Caught signal (Aborted) **
 in thread 7f8748561700 thread_name:PQ_Finisher

 ceph version 14.2.16-402-g7d47dbaf4d (7d47dbaf4d0960a2e910628360ae36def84ed913) nautilus (stable)

It's a production cluster, so this is fairly urgent.

Regards!

-- 
Miguel Armas
CanaryTek Consultoria y Sistemas SL
http://www.canarytek.com/