Hi all,

We have seen this several times: some of our MDS daemons stop reporting stats for no obvious reason. A rolling restart of the affected MDS daemons resolves it, but restarting an active MDS can cause up to several minutes of downtime, so we don't want to do this regularly.

The client count and MDS version info are also missing from "ceph fs status" and the web dashboard, and Prometheus metrics are affected as well. However, "ceph tell mds.cephfs.gpu018.ovxvoz session ls" reports the correct client sessions.

Also, the new "cephfs-top" does not work for us; it only shows a lot of N/A. I don't know whether that is related.

Apart from this, the actual metadata operations seem to work fine.

How can I identify the root cause? Is this a known bug?

Thanks,
Weiwen Hu

$ ceph fs status
cephfs - 0 clients
======
RANK  STATE           MDS                   ACTIVITY         DNS   INOS   DIRS  CAPS
0     active          cephfs.gpu018.ovxvoz  Reqs:    0 /s      0      0      0     0
1     active          cephfs.gpu006.ddpekw  Reqs:    0 /s      0      0      0     0
1-s   standby-replay  cephfs.gpu023.aetiph  Evts:    0 /s      0      0      0     0
0-s   standby-replay  cephfs.gpu024.rpfbnh  Evts:   69 /s  2242k  2242k  11.5k     0

POOL                      TYPE      USED   AVAIL
cephfs.cephfs.meta        metadata  127G   523G
cephfs.cephfs.data        data      74.6T  15.8T
cephfs.cephfs.data_ssd    data      0      785G
cephfs.cephfs.data_mixed  data      8768G  523G

VERSION                                                                          DAEMONS
None                                                                             cephfs.gpu018.ovxvoz, cephfs.gpu006.ddpekw, cephfs.gpu023.aetiph
ceph version 16.2.5 (0883bdea7337b95e4b611c768c0279868462204a) pacific (stable)  cephfs.gpu024.rpfbnh

Note the many "0" values, and that 3 of the MDS daemons are missing version info.
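
For anyone who wants to watch for this condition, here is a minimal Python sketch that scans the plain-text "ceph fs status" output shown above and lists the MDS daemons whose DNS, INOS and DIRS columns are all 0. The function name find_silent_mds is made up for illustration, and the assumption that every ranked MDS row splits into exactly ten whitespace-separated fields is based only on the output in this mail; other Ceph releases may lay the table out differently.

```python
# Hypothetical helper: flags MDS rows in "ceph fs status" text output
# whose DNS, INOS and DIRS counters are all zero.
# Assumption: a ranked MDS row looks like
#   RANK STATE NAME Reqs:|Evts: <rate> /s DNS INOS DIRS CAPS
# which is what the 16.2.5 output in this mail shows.

SAMPLE = """\
cephfs - 0 clients
======
RANK  STATE           MDS                   ACTIVITY         DNS   INOS   DIRS  CAPS
0     active          cephfs.gpu018.ovxvoz  Reqs:    0 /s      0      0      0     0
1     active          cephfs.gpu006.ddpekw  Reqs:    0 /s      0      0      0     0
1-s   standby-replay  cephfs.gpu023.aetiph  Evts:    0 /s      0      0      0     0
0-s   standby-replay  cephfs.gpu024.rpfbnh  Evts:   69 /s  2242k  2242k  11.5k     0
"""


def find_silent_mds(status_text):
    """Return names of MDS daemons that report 0 for DNS, INOS and DIRS."""
    silent = []
    for line in status_text.splitlines():
        parts = line.split()
        # Skip headers, pool rows and anything that is not a ranked MDS row.
        if len(parts) != 10 or parts[3] not in ("Reqs:", "Evts:"):
            continue
        name = parts[2]
        dns, inos, dirs = parts[6:9]
        # CAPS is ignored on purpose: the healthy standby-replay daemon
        # in the sample also shows 0 there.
        if dns == inos == dirs == "0":
            silent.append(name)
    return silent


if __name__ == "__main__":
    print(find_silent_mds(SAMPLE))
```

On the sample this flags gpu018, gpu006 and gpu023 but not gpu024. It only detects the symptom and says nothing about the root cause.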