Re: nautilus cluster down by loss of 2 mons

Could someone also explain the logic behind the decision to dump so much data to disk? Especially in container environments with resource limits, this is far from ideal.

> -----Original Message-----
> Sent: Tuesday, 31 August 2021 19:16
> To: ceph-users@xxxxxxx
> Subject:  nautilus cluster down by loss of 2 mons
> 
> Hi
> 
> We have a Nautilus cluster that was hit by a network failure. One of
> the monitors fell out of quorum.
> Once the network settled down and all OSDs were back online, we got
> that mon synchronizing again.
> 
> However, usage on the filesystem suddenly exploded within a minute or so,
> from 63G to 93G, resulting in 100% usage.
> At that point we decided to remove that mon from the cluster and hoped to
> compact the database on the remaining mons, so that
> we could add a new mon with less synchronizing to do because
> of the smaller database size.
> 
> Unfortunately, the tell osd compact command made the database on mon nr 2
> grow very fast, resulting in another full filesystem,
> hence a dead cluster.
> 
> Can anyone advise on the fastest recovery in this situation?
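
Since both the sync and the compaction described above filled the disk, a pre-flight check of the mon store size against the free space on its filesystem may help before retrying either. The following is only a rough sketch, not a Ceph tool: the store path is a hypothetical example based on the default mon data layout, and the safety factor of 1.0 (free space at least equal to the current store size) is my own assumption, not a documented Ceph requirement.

```python
import os
import shutil

GIB = 1024 ** 3


def dir_size_bytes(path):
    """Sum the sizes of all regular files below path (symlinks not followed)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.lstat(os.path.join(root, name)).st_size
            except OSError:
                # A file may vanish while the store is being rewritten.
                pass
    return total


def has_headroom(store_bytes, free_bytes, factor=1.0):
    """True if free space covers factor times the store size.

    factor is an assumed safety margin, not a value taken from Ceph.
    """
    return free_bytes >= store_bytes * factor


if __name__ == "__main__":
    # Hypothetical path; adjust the cluster name and mon id to your setup.
    store = "/var/lib/ceph/mon/ceph-a/store.db"
    if os.path.isdir(store):
        used = dir_size_bytes(store)
        free = shutil.disk_usage(store).free
        print(f"store: {used / GIB:.1f} GiB, free: {free / GIB:.1f} GiB")
        print("headroom ok" if has_headroom(used, free) else "NOT enough headroom")
    else:
        print(f"{store} not found; set the path for your mon")
```

If the figures in the report describe the mon filesystem, the store was about 63G with about 30G free, and `has_headroom(63 * GIB, 30 * GIB)` would have flagged the risk under this assumed margin.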
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


