Could someone also explain the logic behind the decision to dump so much data to the disk? Especially in container environments with resource limits, this is not really nice.

> -----Original Message-----
> Sent: Tuesday, 31 August 2021 19:16
> To: ceph-users@xxxxxxx
> Subject: nautilus cluster down by loss of 2 mons
>
> Hi
>
> We have a Nautilus cluster that was hit by a network failure. One of the
> monitors fell out of quorum.
> Once the network settled down and all OSDs were back online, that mon
> started synchronizing.
>
> However, its filesystem usage suddenly exploded, within a minute or so,
> from 63G to 93G, resulting in 100% usage.
> At that point we decided to remove that mon from the cluster and hoped to
> compact the database on the remaining mons, so that we could add a new
> mon while there was less synchronizing to do because of the smaller
> database size.
>
> Unfortunately, the tell osd compact command made the database on mon
> nr 2 grow very fast, resulting in another full filesystem and hence a
> dead cluster.
>
> Can anyone advise on the fastest recovery in this situation?
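
For context, a minimal sketch of the compaction step discussed above, assuming a standard deployment; the mon id "mon2" and the data path /var/lib/ceph/mon/ceph-mon2 are placeholders, not taken from the thread:

    # Check free space on the mon data partition first; RocksDB compaction
    # temporarily needs extra headroom while it rewrites its SST files.
    df -h /var/lib/ceph/mon/ceph-mon2

    # Trigger an online compaction of a single monitor's store.
    ceph tell mon.mon2 compact

    # Alternatively, compact the store on every monitor start
    # (set in ceph.conf under the [mon] section).
    # mon_compact_on_start = true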