Re: Ceph complete cluster failure: unknown PGs


 



Do your current CRUSH rules for your pools still apply to the new OSD map
with only those 4 nodes? If you have, for example, an EC 4+2 pool in an
8-node cluster and only 4 nodes are left, you have dropped below your
min_size, so please check that first.
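
A quick sketch of the checks I mean (standard ceph CLI run against a
reachable mon; the rule and profile names are placeholders for whatever
your pools actually use):

    ceph osd tree                                      # which hosts/OSDs are actually up
    ceph osd pool ls detail                            # size, min_size and crush_rule per pool
    ceph osd crush rule dump <rule-name>               # failure domain the rule enforces
    ceph osd erasure-code-profile get <profile-name>   # k, m and crush-failure-domain

With a 4+2 profile and host as the failure domain, each PG needs 6
distinct hosts for its shards; with only 4 hosts up, at most 4 shards per
PG are available, which is below the usual EC min_size of k+1 = 5, so
those PGs cannot go active.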

On Thu, 28 Sep 2023 at 9:24 PM, <v1tnam@xxxxxxxxx> wrote:
>
> I have an 8-node cluster with old hardware. A week ago 4 nodes went down and the Ceph cluster went nuts.
> All PGs became unknown and the monitors took too long to get in sync,
> so I reduced the number of mons to one and the mgrs to one as well.
>
> Now recovery starts with 100% unknown PGs, and then PGs start to move to inactive. It generally fails partway through and starts over from scratch.
>
> It's old hardware, the OSDs have lots of slow ops, and probably a number of bad sectors as well.
>
> Any suggestions on how to tackle this? It's a Nautilus cluster on pretty old (8-year-old) hardware.
>
> Thanks
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



