I am planning a Luminous to Nautilus upgrade. The instructions state
(very terse version):
- install Nautilus ceph packages
- restart MONs
- restart MGRs
- restart OSDs
We have OSDs running on our MON hosts (essentially all our ceph hosts
are the same chassis). So, if everything goes properly we simply restart
the MONs on the hosts with them after adding the Nautilus packages and
then go back and restart the OSDs. Where is the problem?
I'm wondering about the situation where you are part way through
restarting your MONs (or MGRs) and one of the hosts reboots (or perhaps
a single OSD on one of the MON hosts crashes and is restarted). I.e. you
now have one (or more) OSDs running Nautilus before you've finished
restarting all the MONs. I've tested this briefly and it looks like the
OSD rejoins the cluster, but space utilization for it goes crazy thereafter.
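
To make the scenario concrete, here is a rough sketch of how the mixed state could be detected from the JSON printed by `ceph versions`. It assumes that output is a dict keyed by daemon type ("mon", "osd", ...) whose values map full version strings to daemon counts. The function names and the sample data are made up for illustration; this is not an official tool.

```python
import re


def release_names(version_counts):
    """Return the release code names (e.g. 'luminous') seen in one daemon section."""
    names = set()
    for version_string in version_counts:
        # Assumed format: "ceph version 14.2.4 (<hash>) nautilus (stable)"
        match = re.search(r"\)\s+(\w+)\s+\(", version_string)
        # Fall back to the raw string if the format is not what we expect.
        names.add(match.group(1) if match else version_string)
    return names


def osds_ahead_of_mons(versions, new_release="nautilus"):
    """True if any OSD runs new_release while not all MONs do yet."""
    mon_releases = release_names(versions.get("mon", {}))
    osd_releases = release_names(versions.get("osd", {}))
    mons_done = mon_releases == {new_release}
    return (new_release in osd_releases) and not mons_done


# Hypothetical snapshot: one MON and one OSD restarted early on Nautilus.
sample = {
    "mon": {
        "ceph version 12.2.12 (aaaa) luminous (stable)": 2,
        "ceph version 14.2.4 (bbbb) nautilus (stable)": 1,
    },
    "osd": {
        "ceph version 12.2.12 (aaaa) luminous (stable)": 11,
        "ceph version 14.2.4 (bbbb) nautilus (stable)": 1,
    },
}

if __name__ == "__main__":
    print(osds_ahead_of_mons(sample))
```

On a live cluster the dict would come from something like `json.loads(subprocess.check_output(["ceph", "versions"]))`. The check only reports the condition; it says nothing about how to recover from it.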
So my question is: if this happens, what is the recommended remedial
action? Is destroying the impacted OSDs the only option?