Re: Disaster recovery of monitor

On 11/17/2015 04:56 AM, Jose Tavares wrote:
> The problem is that I think I don't have any good monitor anymore.
> How do I know if the map I am trying is ok?
> 

What do you mean, there is no good monitor? Did you encounter a disk
failure or something?

> I also saw in the logs that the primary mon was trying to contact a
> removed mon at IP .112 .. So, I added .112 again ... and it didn't help.
> 

"Added" again? You started that monitor?

> Attached are the logs of what is going on and some monmaps that I
> captured minutes before the cluster became inaccessible ..
> 

Isn't there a large time drift somewhere? Failing cephx authorization can
also point to a large time drift on the clients and OSDs. Are you sure the
time is correct?
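
A minimal sketch of such a time check, in Python. It assumes you have already collected a Unix timestamp from each host at roughly the same moment (how you collect them is up to you). The host names, the function name and the 0.05 s threshold are illustrative assumptions; the threshold is meant to mirror the usual default of `mon clock drift allowed`, so adjust it to your configuration.

```python
from typing import Dict, List

# Assumption: 0.05 s mirrors the usual default of `mon clock drift allowed`.
MAX_DRIFT_SECONDS = 0.05


def find_drifting_hosts(
    clock_readings: Dict[str, float],
    reference_host: str,
    max_drift: float = MAX_DRIFT_SECONDS,
) -> List[str]:
    """Return the hosts whose clock differs from the reference host by more than max_drift seconds.

    clock_readings maps a (hypothetical) host name to the Unix timestamp
    that host reported; all readings should be taken at about the same time.
    """
    reference = clock_readings[reference_host]
    return sorted(
        host
        for host, timestamp in clock_readings.items()
        if abs(timestamp - reference) > max_drift
    )


if __name__ == "__main__":
    # Hypothetical readings: osd-1 is 3.5 s ahead of mon-a.
    readings = {"mon-a": 1000.0, "mon-b": 1000.01, "osd-1": 1003.5}
    print(find_drifting_hosts(readings, reference_host="mon-a"))
```

Any host this reports should have its NTP setup fixed before you touch the monitor stores.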

> Should I try injecting these monmaps into my primary mon to see if it can
> recover the cluster?
> Is it possible to see if these monmaps match my content?
> 

The monmaps probably didn't change that much. But a good Monitor also
has the PGMaps, OSDMaps, etc. You need a lot more than just a monmap.
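
If you only want to compare monmaps before deciding anything, here is a rough Python sketch that builds the command lines for extracting a monmap from a stopped monitor, printing it, and injecting one. It only constructs argument lists and runs nothing. The monitor id and file path are placeholders, and the flags are the ones from the add-or-rm-mons documentation; please verify them against your 0.80.x installation before using them.

```python
from typing import List


def extract_monmap_cmd(mon_id: str, out_path: str) -> List[str]:
    """Command to dump the monmap held by a (stopped) monitor into out_path."""
    return ["ceph-mon", "-i", mon_id, "--extract-monmap", out_path]


def print_monmap_cmd(monmap_path: str) -> List[str]:
    """Command to print the epoch, fsid and monitor addresses in a monmap file."""
    return ["monmaptool", "--print", monmap_path]


def inject_monmap_cmd(mon_id: str, monmap_path: str) -> List[str]:
    """Command to inject a monmap file into a (stopped) monitor's store."""
    return ["ceph-mon", "-i", mon_id, "--inject-monmap", monmap_path]


if __name__ == "__main__":
    # Placeholder monitor id and path; replace with your own.
    for cmd in (
        extract_monmap_cmd("a", "/tmp/monmap"),
        print_monmap_cmd("/tmp/monmap"),
    ):
        print(" ".join(cmd))
```

Printing each saved monmap and comparing it with the one extracted from the monitor shows whether they differ, but as said, this covers only the monmap and not the rest of the monitor store.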

But check the time first on those machines.

Wido

> Thanks a lot.
> Jose Tavares
> 
> 
> 
> 
> 
> On Mon, Nov 16, 2015 at 10:48 PM, Nathan Harper
> <nathan.harper@xxxxxxxxxxx <mailto:nathan.harper@xxxxxxxxxxx>> wrote:
> 
>     I had to go through a similar process when we had a disaster which
>     destroyed one of our monitors.   I followed the process here:
>     REMOVING MONITORS FROM AN UNHEALTHY CLUSTER
>     <http://docs.ceph.com/docs/hammer/rados/operations/add-or-rm-mons/> to
>     remove all but one monitor, which let me bring the cluster back up.  
> 
>     As you are running an older version of Ceph than hammer, some of the
>     commands might differ (perhaps this might
>     help http://docs.ceph.com/docs/v0.80/rados/operations/add-or-rm-mons/)
> 
> 
>     -- 
>     *Nathan Harper*// IT Systems Architect
> 
>     *e: * nathan.harper@xxxxxxxxxxx <mailto:nathan.harper@xxxxxxxxxxx>
>     // *t: * 0117 906 1104 // *m: * 07875 510891 // *w: *
>     www.cfms.org.uk <http://www.cfms.org.uk> // LinkedIn
>     <http://uk.linkedin.com/pub/nathan-harper/21/696/b81>
>     CFMS Services Ltd// Bristol & Bath Science Park // Dirac Crescent //
>     Emersons Green // Bristol // BS16 7FR
>      
>     CFMS Services Ltd is registered in England and Wales No 05742022 - a
>     subsidiary of CFMS Ltd
>     CFMS Services Ltd registered office // Victoria House // 51 Victoria
>     Street // Bristol // BS1 6AD
> 
>     On 16 November 2015 at 16:50, Jose Tavares <jat@xxxxxxxxxxxx
>     <mailto:jat@xxxxxxxxxxxx>> wrote:
> 
>         Hi guys ...
>         I need some help as my cluster seems to be corrupted.
> 
>         I saw here .. 
>         https://www.mail-archive.com/ceph-users@xxxxxxxxxxxxxx/msg01919.html
>         .. a msg from 2013 where Peter had a problem with his monitors.
> 
>         I had the same problem today when trying to add a new monitor,
>         and then playing with the monmap as the monitors were not entering
>         the quorum. I'm using version 0.80.8.
> 
>         Right now my cluster won't start because of a corrupted monitor.
>         Is it possible to remove all monitors and create just a new one
>         without losing data? I have ~260GB of data representing 2 weeks of work.
> 
>         What should I do? Do you recommend any specific procedure?
> 
>         Thanks a lot.
>         Jose Tavares
> 
>         _______________________________________________
>         ceph-users mailing list
>         ceph-users@xxxxxxxxxxxxxx <mailto:ceph-users@xxxxxxxxxxxxxx>
>         http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
> 
> 
> 
> 
> 


-- 
Wido den Hollander
42on B.V.
Ceph trainer and consultant

Phone: +31 (0)20 700 9902
Skype: contact42on