Hi,
On 06/09/2021 08:37, Lokendra Rathour wrote:
> Thanks, Matthew, for the update.
> The upgrade failed for some random, weird reasons. Checking further,
> Ceph's status shows "Ceph health is OK"; at times it gives certain
> warnings, but I think that is fine.
OK...
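If you want to double-check, the usual commands are:

    ceph -s               # overall cluster status
    ceph health detail    # expands any warnings into specific messages

Transient warnings during an upgrade (daemons restarting, PGs briefly
degraded) are common; warnings that persist afterwards are worth
chasing down.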
> But what if we see a version mismatch between the daemons, i.e. a few
> services have upgraded and the rest could not be upgraded? In this
> state, we could do one of two things:
> * Retry the upgrade activity (to Pacific) - it might work this time.
> * Go back to the older version (Octopus) - is this possible, and if
> so, how?
In general, downgrades are not supported, so I think continuing with
the upgrade is the best answer.
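For reference, 'ceph versions' shows the per-daemon version breakdown,
so you can see exactly what is left to upgrade. If the cluster is
managed by cephadm (an assumption on my part - adapt as needed for
ceph-ansible or a manual rolling upgrade), resuming looks roughly like:

    ceph versions                                   # which daemons are still on Octopus
    ceph orch upgrade status                        # is an upgrade already in progress or paused?
    ceph orch upgrade start --ceph-version 16.2.5   # (re)start the upgrade; substitute your target Pacific release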
> *Other Query:*
> What if the complete cluster goes down, i.e. the mons crash and other
> daemons crash - can we try to restore the data in the OSDs, maybe by
> reusing the OSDs in another or a new Ceph cluster, or something to
> save the data?
You will generally have more than 1 mon (typically 3, some people have
5), and as long as a quorum remains, you will still have a working
cluster. If you somehow manage to break all your mons, there is an
emergency procedure for recreating the mon map from an OSD -
https://docs.ceph.com/en/pacific/rados/troubleshooting/troubleshooting-mon/#recovery-using-osds
...but you don't want to end up in that situation!
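A quick way to keep an eye on that (standard commands, nothing assumed
about your setup):

    ceph mon stat                              # which mons exist and which are in quorum
    ceph quorum_status --format json-pretty    # detailed quorum view, including the leader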
RADOS typically spreads your data across multiple placement groups (and
thus across multiple OSDs); while there are tools to extract data from
OSDs (e.g. https://docs.ceph.com/en/latest/man/8/ceph-objectstore-tool/),
you won't get complete objects this way. Instead, the advice would be to
try to get enough mons back up to bring your cluster at least to a
read-only state and then attempt recovery that way.
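For completeness, extracting from a single OSD looks something like the
below - assuming a non-containerised OSD with the default data path and
a placeholder OSD id and PG id, so adjust for your deployment (cephadm
deployments need the commands run inside the OSD's container/shell):

    systemctl stop ceph-osd@0
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 --op list-pgs
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 \
        --pgid 1.0 --op export --file /tmp/pg.1.0.export

But again, that only gives you whatever lives on that one OSD, not your
complete data; treat it as a last resort and prefer recovering the mons.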
HTH,
Matthew