Thank you very much. But the MDSs still don't go active. While trying to resolve this, I ran:

ceph mds rmfailed 0 --yes-i-really-mean-it
ceph mds rmfailed 1 --yes-i-really-mean-it

Then 3 out of 5 MONs crashed. I was able to keep the MONs up by using gdb to make MDSMonitor::maybe_resize_cluster return false directly. Then I set max_mds back to 2. Now my MONs no longer crash. I've really learnt a lesson from this. Now I suppose I need to figure out how to undo the "mds rmfailed" command?

Current "ceph fs dump" (note that 7 has been added to "incompat", "max_mds" is 2, and "failed" is cleared):

e41448
enable_multiple, ever_enabled_multiple: 0,1
default compat: compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=no anchor table,9=file layout v2,10=snaprealm v2}
legacy client fscid: 2

Filesystem 'cephfs' (2)
fs_name cephfs
epoch   41442
flags   12
created 2020-09-15T04:10:53.585782+0000
modified        2021-09-17T17:51:57.582372+0000
tableserver     0
root    0
session_timeout 60
session_autoclose       300
max_file_size   1099511627776
required_client_features        {}
last_failure    0
last_failure_osd_epoch  43315
compat  compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,7=mds uses inline data,8=no anchor table,9=file layout v2,10=snaprealm v2}
max_mds 2
in      0,1
up      {}
failed
damaged
stopped
data_pools      [5,13,16]
metadata_pool   4
inline_data     disabled
balancer
standby_count_wanted    1

Standby daemons:

[mds.cephfs.gpu024.rpfbnh{-1:7918294} state up:standby seq 1 join_fscid=2 addr [v2:202.38.247.187:6800/94739959,v1:202.38.247.187:6801/94739959] compat {c=[1],r=[1],i=[7ff]}]
dumped fsmap epoch 41448

From: Patrick Donnelly<mailto:pdonnell@xxxxxxxxxx>
Sent: September 18, 2021 0:24
To: 胡 玮文<mailto:huww98@xxxxxxxxxxx>
Cc: Eric Dold<mailto:dold.eric@xxxxxxxxx>;
ceph-users<mailto:ceph-users@xxxxxxx>
Subject: Re: Re: Cephfs - MDS all up:standby, not becoming up:active

On Fri, Sep 17, 2021 at 11:11 AM 胡 玮文 <huww98@xxxxxxxxxxx> wrote:
>
> We are experiencing the same when upgrading to 16.2.6 with cephadm.
>
> I tried
>
> ceph fs set cephfs max_mds 1
> ceph fs set cephfs allow_standby_replay false
>
> , but still all MDS go to standby. It seems all ranks are marked failed. Do we have a way to clear this flag?
> [...]
> compat  compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=no anchor table,9=file layout v2,10=snaprealm v2}

Please run:

ceph fs compat add_incompat cephfs 7 "mds uses inline data"

--
Patrick Donnelly, Ph.D.
He / Him / His
Principal Software Engineer
Red Hat Sunnyvale, CA
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
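[Editor's note, for readers following the compat mismatch in this thread:] the standby daemon in the dump advertises compat {c=[1],r=[1],i=[7ff]}, while the filesystem's own incompat set was missing feature 7 ("mds uses inline data"). Assuming Ceph's CompatSet encoding works the way its source suggests (bit 0 is an always-set base bit and bit N is set for feature ID N; this is an inference, not documented behavior), the hex mask 7ff decodes to features 1 through 10, which is exactly why adding feature 7 to the filesystem lets the MDSs join. A minimal sketch of that decoding:

```python
# Sketch: decode a CompatSet hex mask like the "i=[7ff]" shown by
# "ceph fs dump". Assumption: bit 0 is an always-set base marker and
# bit N encodes feature ID N (inferred from Ceph's CompatSet code).

def decode_compat_mask(mask_hex: str) -> list[int]:
    """Return the feature IDs encoded in a CompatSet hex mask."""
    mask = int(mask_hex, 16)
    # Skip bit 0 (base marker); feature IDs start at 1.
    return [bit for bit in range(1, mask.bit_length()) if mask & (1 << bit)]

# The standby MDS advertises i=[7ff]: features 1..10, feature 7 included.
mds_features = decode_compat_mask("7ff")
print(mds_features)  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# The filesystem's incompat set from the dump (before the fix) lacked 7:
fs_features = {1, 2, 3, 4, 5, 6, 8, 9, 10}
missing = sorted(set(mds_features) - fs_features)
print(missing)  # [7] -- the feature that "ceph fs compat add_incompat cephfs 7" adds
```

This only illustrates why the mask and the filesystem compat set disagreed; the actual fix is the ceph command above.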