Hi,
this is another example of why min_size 1/size 2 is a bad choice (if you
value your data). There have been plenty of discussions on this list
about that, so I won't go into detail here. I'm not familiar
with rook, but activating existing OSDs usually works fine [1].
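As a rough sketch (with <pool> and <host> as placeholders for your pool
and the rebuilt host), checking and raising the replication settings and
then activating the existing OSDs would look roughly like this:

  # inspect the current replication settings of the pool
  ceph osd pool get <pool> size
  ceph osd pool get <pool> min_size

  # the usual recommendation for replicated pools: size 3, min_size 2
  ceph osd pool set <pool> size 3
  ceph osd pool set <pool> min_size 2

  # activate existing OSDs on a reinstalled host (cephadm, see [1])
  ceph cephadm osd activate <host>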
Regards,
Eugen
[1] https://docs.ceph.com/en/reef/cephadm/services/osd/#activate-existing-osds
Quoting Matthew Booth <mbooth@xxxxxxxxxx>:
I have a 3 node ceph cluster in my home lab. One of the pools spans 3
hdds, one on each node, and has size 2, min size 1. One of my nodes is
currently down, and I have 160 pgs in 'unknown' state. The other 2
hosts are up and the cluster has quorum.
Example `ceph health detail` output:
pg 9.0 is stuck inactive for 25h, current state unknown, last acting []
I have 3 questions:

1. Why would the pgs be in an unknown state?

2. I would like to recover the cluster without recovering the failed
   node, primarily so that I know I can. Is that possible?

3. The boot nvme of the host has failed, so I will most likely rebuild
   it. I'm running rook, and I will most likely delete the old node and
   create a new one with the same name. AFAIK, the OSDs are fine. When
   rook rediscovers the OSDs, will it add them back with data intact? If
   not, is there any way I can make it so it will?
Thanks!
--
Matthew Booth
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx