On Wed, 29 May 2024, Eugen Block wrote:

> I'm not really sure either, what about this?
>
>    ceph mds repaired <rank>

I think that works only for ranks marked 'damaged'.

N.

> The docs state:
>
> > Mark the file system rank as repaired. Unlike the name suggests, this command
> > does not change a MDS; it manipulates the file system rank which has been
> > marked damaged.
>
> Maybe that could bring it back up? Did you set max_mds to 1 at some point? If
> you do it now (and you currently have only one active MDS), maybe that would
> clean up the failed rank as well?
>
> Quoting "Noe P." <ml@am-rand.berlin>:
>
> > Hi,
> >
> > after our disaster yesterday, it seems that we got our MONs back.
> > One of the filesystems, however, seems to be in a strange state:
> >
> >    % ceph fs status
> >
> >    ....
> >    fs_cluster - 782 clients
> >    ==========
> >    RANK  STATE    MDS       ACTIVITY     DNS    INOS   DIRS   CAPS
> >     0    active  cephmd6a  Reqs: 5 /s   13.2M  13.2M  1425k  51.4k
> >     1    failed
> >          POOL         TYPE     USED   AVAIL
> >    fs_cluster_meta  metadata  3594G  53.5T
> >    fs_cluster_data    data     421T  53.5T
> >    ....
> >    STANDBY MDS
> >     cephmd6b
> >     cephmd4b
> >    MDS version: ceph version 17.2.7 (b12291d110049b2f35e32e0de30d70e9a4c060d2) quincy (stable)
> >
> >    % ceph fs dump
> >    ....
> >    Filesystem 'fs_cluster' (3)
> >    fs_name fs_cluster
> >    epoch   3068261
> >    flags   12 joinable allow_snaps allow_multimds_snaps
> >    created 2022-08-26T15:55:07.186477+0200
> >    modified        2024-05-29T12:43:30.606431+0200
> >    tableserver     0
> >    root    0
> >    session_timeout 60
> >    session_autoclose       300
> >    max_file_size   4398046511104
> >    required_client_features        {}
> >    last_failure    0
> >    last_failure_osd_epoch  1777109
> >    compat  compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,7=mds uses inline data,8=no anchor table,9=file layout v2,10=snaprealm v2}
> >    max_mds 2
> >    in      0,1
> >    up      {0=911794623}
> >    failed
> >    damaged
> >    stopped 2,3
> >    data_pools      [32]
> >    metadata_pool   33
> >    inline_data     disabled
> >    balancer
> >    standby_count_wanted    1
> >    [mds.cephmd6a{0:911794623} state up:active seq 44701 addr [v2:10.13.5.6:6800/189084355,v1:10.13.5.6:6801/189084355] compat {c=[1],r=[1],i=[7ff]}]
> >
> > We would like to get rid of the failed rank 1 (without crashing the MONs)
> > and have a second MDS from the standbys step in.
> >
> > Anyone have an idea how to do this?
> > I'm a bit reluctant to try 'ceph mds rmfailed', as this seems to have
> > triggered the MONs to crash.
> >
> > Regards,
> >   Noe

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
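
For reference, a sketch of the two approaches discussed in the thread above. Both act on the file system rank, not on an MDS daemon; the name fs_cluster and rank 1 are taken from the output above, and whether either one clears a rank listed as 'failed' (rather than 'damaged') is exactly the open question here:

   # Only applicable if the rank appears under 'damaged' in 'ceph fs dump'
   % ceph mds repaired fs_cluster:1

   # Shrink to a single rank, then grow again so a standby can claim rank 1
   % ceph fs set fs_cluster max_mds 1
   % ceph fs set fs_cluster max_mds 2

   # Check whether a standby has taken rank 1
   % ceph fs status fs_cluster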