Re: Disaster recovery of monitor

The problem is that I don't think I have any good monitor left.
How do I know if the map I am trying is ok?

I also saw in the logs that the primary mon was trying to contact a removed mon at IP .112, so I added .112 again, but it didn't help.

Attached are the logs of what is going on and some monmaps that I captured minutes before the cluster became inaccessible.

Should I try injecting these monmaps into my primary mon to see if it can recover the cluster?
Is it possible to check whether these monmaps match my content?

Thanks a lot.
Jose Tavares
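
On the question of checking a captured monmap: monmaptool --print dumps a monmap file in readable form (epoch, fsid, and the monitor names and addresses), which can then be compared by eye against the fsid and monitor addresses in ceph.conf. A minimal sketch, assuming the attached monmaps have been unpacked into a directory such as ./monmaps (a hypothetical path, not taken from this thread):

```shell
#!/bin/sh
# Print every captured monmap in a directory so the copies can be compared.
# The directory passed in is a hypothetical location for the unpacked files.

print_monmaps() {
    for f in "$1"/*; do
        # Skip anything that is not a regular file (including an unmatched glob).
        [ -f "$f" ] || continue
        echo "== $f"
        # Shows the epoch, fsid and the list of monitors recorded in this map.
        monmaptool --print "$f"
    done
}
```

Call it as print_monmaps ./monmaps. It only reads the files and does not touch any monitor store.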





On Mon, Nov 16, 2015 at 10:48 PM, Nathan Harper <nathan.harper@xxxxxxxxxxx> wrote:
I had to go through a similar process when we had a disaster which destroyed one of our monitors. I followed the process described in "Removing Monitors from an Unhealthy Cluster" to remove all but one monitor, which let me bring the cluster back up.

As you are running a version of Ceph older than Hammer, some of the commands might differ. This might help: http://docs.ceph.com/docs/v0.80/rados/operations/add-or-rm-mons/
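
The procedure referenced above comes down to extracting the monmap from the surviving monitor, removing the dead monitors from it, and injecting the result back. A rough sketch, assuming a surviving monitor with ID a and dead monitors b and c (all three IDs and the /tmp/monmap path are placeholders, not taken from this thread). The monitor daemon must be stopped while extracting and injecting, and it is prudent to copy the monitor data directory first. By default the script only prints the commands; set DRY_RUN=0 to actually run them.

```shell
#!/bin/sh
# Sketch: shrink the monmap down to one surviving monitor.
# SURVIVOR, BAD_MONS and MAP are hypothetical placeholders; adjust them.
# Stop all ceph-mon daemons and back up the mon data directory beforehand.
SURVIVOR="${SURVIVOR:-a}"
BAD_MONS="${BAD_MONS:-b c}"
MAP="${MAP:-/tmp/monmap}"
DRY_RUN="${DRY_RUN:-1}"

# Echo the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

shrink_monmap() {
    # 1. Pull the current monmap out of the surviving monitor's store.
    run ceph-mon -i "$SURVIVOR" --extract-monmap "$MAP"
    # 2. Drop each unreachable monitor from the extracted map.
    for m in $BAD_MONS; do
        run monmaptool "$MAP" --rm "$m"
    done
    # 3. Review the edited map before committing to it.
    run monmaptool --print "$MAP"
    # 4. Write the edited map back into the surviving monitor.
    run ceph-mon -i "$SURVIVOR" --inject-monmap "$MAP"
}

shrink_monmap
```

After a successful injection, start only the surviving monitor and check that it forms a quorum on its own before adding further monitors.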


--
Nathan Harper // IT Systems Architect

e: nathan.harper@xxxxxxxxxxx // t: 0117 906 1104 // m: 07875 510891 // w: www.cfms.org.uk
CFMS Services Ltd // Bristol & Bath Science Park // Dirac Crescent // Emersons Green // Bristol // BS16 7FR
 
CFMS Services Ltd is registered in England and Wales No 05742022 - a subsidiary of CFMS Ltd
CFMS Services Ltd registered office // Victoria House // 51 Victoria Street // Bristol // BS1 6AD

On 16 November 2015 at 16:50, Jose Tavares <jat@xxxxxxxxxxxx> wrote:
Hi guys ...
I need some help as my cluster seems to be corrupted.

I saw here .. 
.. a msg from 2013 where Peter had a problem with his monitors.

I had the same problem today when trying to add a new monitor and then playing with the monmap, as the monitors were not entering the quorum. I'm using version 0.80.8.

Right now my cluster won't start because of a corrupted monitor. Is it possible to remove all monitors and create just one new monitor without losing data? I have ~260GB of data, representing 2 weeks of work.

What should I do? Do you recommend any specific procedure?

Thanks a lot.
Jose Tavares

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



Attachment: monmap_files.tar.gz
Description: GNU Zip compressed data

Attachment: ceph-mon.log.gz
Description: GNU Zip compressed data
