Hi,

after our disaster yesterday, it seems that we got our MONs back. One of
the filesystems, however, seems to be in a strange state:

% ceph fs status
....
fs_cluster - 782 clients
==========
RANK  STATE     MDS       ACTIVITY     DNS    INOS   DIRS   CAPS
 0    active  cephmd6a  Reqs: 5 /s   13.2M  13.2M  1425k  51.4k
 1    failed
      POOL         TYPE     USED  AVAIL
fs_cluster_meta  metadata  3594G  53.5T
fs_cluster_data    data     421T  53.5T
....
STANDBY MDS
 cephmd6b
 cephmd4b
MDS version: ceph version 17.2.7 (b12291d110049b2f35e32e0de30d70e9a4c060d2) quincy (stable)

% ceph fs dump
....
Filesystem 'fs_cluster' (3)
fs_name fs_cluster
epoch 3068261
flags 12 joinable allow_snaps allow_multimds_snaps
created 2022-08-26T15:55:07.186477+0200
modified 2024-05-29T12:43:30.606431+0200
tableserver 0
root 0
session_timeout 60
session_autoclose 300
max_file_size 4398046511104
required_client_features {}
last_failure 0
last_failure_osd_epoch 1777109
compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,7=mds uses inline data,8=no anchor table,9=file layout v2,10=snaprealm v2}
max_mds 2
in 0,1
up {0=911794623}
failed
damaged
stopped 2,3
data_pools [32]
metadata_pool 33
inline_data disabled
balancer
standby_count_wanted 1
[mds.cephmd6a{0:911794623} state up:active seq 44701 addr [v2:10.13.5.6:6800/189084355,v1:10.13.5.6:6801/189084355] compat {c=[1],r=[1],i=[7ff]}]

We would like to get rid of the failed rank 1 (without crashing the MONs)
and have a second MDS from the standbys step in. Does anyone have an idea
how to do this? I'm a bit reluctant to try 'ceph mds rmfailed', as that
seems to be what triggered the MON crash in the first place.

Regards,
Noe
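
P.S. For completeness, here is what I was considering instead of
'rmfailed'. This is untested on this cluster, so please correct me if any
of it is a bad idea in the current state. Since the fs is flagged joinable
and we have two standbys, I would have expected one of them to pick up
rank 1 by itself, so my first thought was to confirm the flags and then
nudge a standby to re-register:

% ceph fs get fs_cluster | grep -E 'joinable|max_mds'
% systemctl restart ceph-mds@cephmd6b    # on the standby host

and, if that changes nothing, to shrink and re-grow the MDS cluster so the
MONs re-assign rank 1:

% ceph fs set fs_cluster max_mds 1
% ceph fs set fs_cluster max_mds 2

The restart line assumes a plain systemd deployment and uses our standby's
name; on a cephadm deployment it would be
'ceph orch daemon restart mds.<daemon-name>' instead.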