Hi,

I got involved in a case where a Nautilus cluster was experiencing MDSes asserting with the backtrace mentioned in this ticket: https://tracker.ceph.com/issues/36349

ceph_assert(follows >= realm->get_newest_seq());

In the end we needed to use this tooling to get one MDS running again: https://docs.ceph.com/docs/master/cephfs/disaster-recovery-experts/#using-an-alternate-metadata-pool-for-recovery

The root cause seems to be that this Nautilus cluster was running Multi-MDS with a very large number of CephFS snapshots.

After a couple of days of scanning (scan_links seems to be single-threaded!) we finally got a single MDS running again with a usable CephFS filesystem. At the moment chown() runs are in progress to set all the permissions back to what they should be.

The outstanding question: Is it safe to enable Multi-MDS again on a CephFS filesystem which still has this many snapshots and is currently running with a single MDS? Creation of new snapshots is disabled at the moment, so no new ones will be made.

In addition: How safe is it to remove snapshots? This will result in metadata updates.

Thanks,

Wido

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx