Do your current CRUSH rules for your pools still apply to the new OSD map with only those 4 nodes? If you have e.g. an EC 4+2 pool in an 8-node cluster and are now down to 4 nodes, you have gone below your min_size, please check.

On Thu, Sep 28, 2023 at 9:24 PM, <v1tnam@xxxxxxxxx> wrote:
>
> I have an 8-node cluster with old hardware. A week ago 4 nodes went down and the Ceph cluster went nuts.
> All PGs became unknown and the monitors took too long to get in sync,
> so I reduced the number of mons to one and the mgrs to one as well.
>
> Now the recovery starts with 100% unknown PGs, and then PGs start to move to inactive. It generally fails partway through and starts over from scratch.
>
> It's old hardware, and the OSDs have lots of slow ops and probably a number of bad sectors as well.
>
> Any suggestions on how to tackle this? It's a Nautilus cluster on pretty old (8-year-old) hardware.
>
> Thanks
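For the min_size question, something like the below is a quick way to check (the profile name is a placeholder here; use whatever `ceph osd pool ls detail` reports for your pool):

    # Show size/min_size and the erasure-code profile for each pool
    ceph osd pool ls detail
    # Show k, m and the crush-failure-domain for that profile
    ceph osd erasure-code-profile get <your-profile>
    # Show which CRUSH rule each pool maps to and its failure domain
    ceph osd crush rule dump

With EC 4+2 the default min_size is typically k+1 = 5, so with only 4 hosts left (and host as the failure domain) those PGs cannot go active no matter how long recovery runs.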
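And for watching what the recovery is actually doing, these are the usual suspects (all standard commands on Nautilus):

    # Overall cluster and PG state summary
    ceph -s
    # Detail on which PGs are unknown/inactive and why
    ceph health detail
    ceph pg dump_stuck inactive
    # Per-OSD commit/apply latency, useful for spotting the slow disks
    ceph osd perf

If `ceph osd perf` shows a couple of OSDs with much higher latency than the rest, those are likely the ones with the bad sectors that keep stalling recovery.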