On Wed, Jan 12, 2022 at 01:24:20PM +0530, Venky Shankar wrote:
> It would be interesting to see what "mds1" was doing around the
> "01:53:07" timestamp. Could you gather that from the mds log?

Nothing special:

debug 2022-01-08T17:48:03.493+0000 7f3c30d91700 1 mds.cephfs.gpu024.rpfbnh Updating MDS map to version 86134 from mon.4
debug 2022-01-08T17:48:22.029+0000 7f3c30d91700 1 mds.cephfs.gpu024.rpfbnh Updating MDS map to version 86135 from mon.4
debug 2022-01-08T17:48:40.154+0000 7f3c30d91700 1 mds.cephfs.gpu024.rpfbnh Updating MDS map to version 86136 from mon.4
debug 2022-01-08T18:01:56.084+0000 7f3c30d91700 1 mds.cephfs.gpu024.rpfbnh Updating MDS map to version 86137 from mon.4
debug 2022-01-08T18:01:59.784+0000 7f3c30d91700 1 mds.cephfs.gpu024.rpfbnh Updating MDS map to version 86138 from mon.4
debug 2022-01-08T18:08:15.788+0000 7f3c30d91700 1 mds.cephfs.gpu024.rpfbnh Updating MDS map to version 86139 from mon.4

(01:53:07 in the kernel log is in timezone +08:00, i.e. 17:53:07 UTC.)

I've checked MDSMap e86137: mds.cephfs.gpu024.rpfbnh is active in rank 1.

I also just checked again: the first OSD_FULL appears in the log at
'Jan 09 01:52:43', or '2022-01-08T17:52:43.726+0000'. So the
mdsc_handle_reply dmesg could be related to OSD_FULL.

I was misled by the Prometheus query:

    ceph_osd_stat_bytes_used / ceph_osd_stat_bytes

which reported only about 79% for that full OSD at the time OSD_FULL
appeared in the cluster log.
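
For anyone wanting to line up the two logs, here is a minimal Python sketch of the timezone arithmetic above. It assumes the kernel log line falls on Jan 09, 2022 local time (+08:00), the same local date as the first OSD_FULL entry, since only the time of day is quoted for it. The helper names kernel_to_utc and ceph_to_utc are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Kernel log timestamps are local time at +08:00.
KERNEL_TZ = timezone(timedelta(hours=8))


def kernel_to_utc(stamp: str, year: int = 2022) -> datetime:
    """Convert a 'Jan 09 01:53:07' style stamp (year assumed) to UTC."""
    local = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
    return local.replace(tzinfo=KERNEL_TZ).astimezone(timezone.utc)


def ceph_to_utc(stamp: str) -> datetime:
    """Convert a '2022-01-08T17:52:43.726+0000' style stamp to UTC."""
    parsed = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S.%f%z")
    return parsed.astimezone(timezone.utc)


# First OSD_FULL in the cluster log (from this mail).
osd_full = ceph_to_utc("2022-01-08T17:52:43.726+0000")
# Kernel log time of the mdsc_handle_reply message; the date is an assumption.
dmesg = kernel_to_utc("Jan 09 01:53:07")

# Seconds between OSD_FULL and the kernel message.
gap = dmesg - osd_full
print(dmesg.isoformat(), gap.total_seconds())
```

Under that date assumption, the kernel message comes less than half a minute after the first OSD_FULL, and it falls in the gap between the MDS map updates to versions 86136 and 86137.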