Hi,

we ran into a bigger problem today with our Ceph cluster (Quincy, Alma 8.9). We have 4 filesystems and a total of 6 MDSs, the largest fs having two ranks assigned (i.e. one MDS left as standby). Since the MDSs often lag behind, we restart them occasionally; usually that helps, with the standby taking over. Today, however, the restart didn't work and the rank 1 MDS started to crash for unclear reasons. Rank 0 seemed ok.

At some point we decided to go back to one rank by setting max_mds to 1. Due to the permanent crashing, rank 1 didn't stop, however, and eventually we marked it failed and set the fs to not joinable. At that point it looked like this:

fs_cluster - 716 clients
==========
RANK  STATE    MDS       ACTIVITY     DNS    INOS   DIRS   CAPS
 0    active   cephmd6a  Reqs: 0 /s  13.1M  13.1M  1419k  79.2k
 1    failed
      POOL         TYPE      USED   AVAIL
fs_cluster_meta   metadata  1791G  54.2T
fs_cluster_data   data       421T  54.2T

with rank 1 still being listed. The next attempt was to remove that failed rank:

ceph mds rmfailed fs_cluster:1 --yes-i-really-mean-it

which, after a short while, brought down three out of our five MONs. They keep crashing shortly after restart with stack traces like this:

ceph version 17.2.7 (b12291d110049b2f35e32e0de30d70e9a4c060d2) quincy (stable)
 1: /lib64/libpthread.so.0(+0x12cf0) [0x7ff8813adcf0]
 2: gsignal()
 3: abort()
 4: /lib64/libstdc++.so.6(+0x9009b) [0x7ff8809bf09b]
 5: /lib64/libstdc++.so.6(+0x9654c) [0x7ff8809c554c]
 6: /lib64/libstdc++.so.6(+0x965a7) [0x7ff8809c55a7]
 7: /lib64/libstdc++.so.6(+0x96808) [0x7ff8809c5808]
 8: /lib64/libstdc++.so.6(+0x92045) [0x7ff8809c1045]
 9: (MDSMonitor::maybe_resize_cluster(FSMap&, int)+0xa9e) [0x55f05d9a5e8e]
 10: (MDSMonitor::tick()+0x18a) [0x55f05d9b18da]
 11: (MDSMonitor::on_active()+0x2c) [0x55f05d99a17c]
 12: (Context::complete(int)+0xd) [0x55f05d76c56d]
 13: (void finish_contexts<std::__cxx11::list<Context*, std::allocator<Context*> > >(ceph::common::CephContext*, std::__cxx11::list<Context*, std::allocator<Context*> >&, int)+0x9d) [0x55f05d799d7d]
 14: (Paxos::finish_round()+0x74) [0x55f05d8c5c24]
 15: (Paxos::dispatch(boost::intrusive_ptr<MonOpRequest>)+0x41b) [0x55f05d8c7e5b]
 16: (Monitor::dispatch_op(boost::intrusive_ptr<MonOpRequest>)+0x123e) [0x55f05d76a2ae]
 17: (Monitor::_ms_dispatch(Message*)+0x406) [0x55f05d76a976]
 18: (Dispatcher::ms_dispatch2(boost::intrusive_ptr<Message> const&)+0x5d) [0x55f05d79b3ed]
 19: (Messenger::ms_deliver_dispatch(boost::intrusive_ptr<Message> const&)+0x478) [0x7ff88367fed8]
 20: (DispatchQueue::entry()+0x50f) [0x7ff88367d31f]
 21: (DispatchQueue::DispatchThread::entry()+0x11) [0x7ff883747381]
 22: /lib64/libpthread.so.0(+0x81ca) [0x7ff8813a31ca]
 23: clone()
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

MDSMonitor::maybe_resize_cluster in the trace suggests a connection to the mds rmfailed operation above. Does anyone have an idea how to get this cluster back together again, e.g. by manually fixing the MDS ranks?
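In case it helps with the diagnosis: the sequence we ran before the rmfailed was roughly the following (written down from memory, so the exact invocations may differ slightly):

ceph fs set fs_cluster max_mds 1        # go back to a single active rank
ceph mds fail fs_cluster:1              # rank 1 kept crashing, so we marked it failed
ceph fs set fs_cluster joinable false   # prevent MDS daemons from joining the fs

followed by the 'ceph mds rmfailed' shown above, which is when the MONs started going down.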
'fs get' can still be called in the short moments when enough MONs are reachable:

fs_name fs_cluster
epoch   3065486
flags   13 allow_snaps allow_multimds_snaps
created 2022-08-26T15:55:07.186477+0200
modified        2024-05-28T12:21:59.294364+0200
tableserver     0
root    0
session_timeout 60
session_autoclose       300
max_file_size   4398046511104
required_client_features       {}
last_failure    0
last_failure_osd_epoch  1777109
compat  compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,7=mds uses inline data,8=no anchor table,9=file layout v2,10=snaprealm v2}
max_mds 1
in      0,1
up      {0=911794623}
failed
damaged
stopped 2,3
data_pools      [32]
metadata_pool   33
inline_data     disabled
balancer
standby_count_wanted    1
[mds.cephmd6a{0:911794623} state up:active seq 22777 addr [v2:10.13.5.6:6800/189084355,v1:10.13.5.6:6801/189084355] compat {c=[1],r=[1],i=[7ff]}]

Regards,
Noe