On 11/17/2015 04:56 AM, Jose Tavares wrote:
> The problem is that I think I don't have any good monitor anymore.
> How do I know if the map I am trying is ok?
>

What do you mean, there is no good monitor? Did you encounter a disk
failure or something?

> I also saw in the logs that the primary mon was trying to contact a
> removed mon at IP .112 .. So, I added .112 again ... and it didn't help.
>

"Added" again? You started that monitor?

> Attached are the logs of what is going on and some monmaps that I
> captured minutes before the cluster became inaccessible ..
>

Isn't there a huge time drift somewhere? Failing cephx authorization can
also point to a huge time drift on the clients and OSDs. Are you sure the
time is correct?

> Should I try injecting these monmaps into my primary mon to see if it
> can recover the cluster?
> Is it possible to see if these monmaps match my content?
>

The monmaps probably didn't change that much. But a good monitor also
has the PGMaps, OSDMaps, etc. You need a lot more than just a monmap.

But check the time first on those machines.

Wido

> Thanks a lot.
> Jose Tavares
>
> On Mon, Nov 16, 2015 at 10:48 PM, Nathan Harper
> <nathan.harper@xxxxxxxxxxx> wrote:
>
> > I had to go through a similar process when we had a disaster which
> > destroyed one of our monitors. I followed the process here:
> > REMOVING MONITORS FROM AN UNHEALTHY CLUSTER
> > <http://docs.ceph.com/docs/hammer/rados/operations/add-or-rm-mons/>
> > to remove all but one monitor, which let me bring the cluster back up.
> >
> > As you are running an older version of Ceph than hammer, some of the
> > commands might differ (perhaps this might help:
> > http://docs.ceph.com/docs/v0.80/rados/operations/add-or-rm-mons/)
> >
> > --
> > Nathan Harper // IT Systems Architect
> > e: nathan.harper@xxxxxxxxxxx // t: 0117 906 1104 // m: 07875 510891 //
> > w: www.cfms.org.uk
> > CFMS Services Ltd // Bristol & Bath Science Park // Dirac Crescent //
> > Emersons Green // Bristol // BS16 7FR
> >
> > On 16 November 2015 at 16:50, Jose Tavares <jat@xxxxxxxxxxxx> wrote:
> >
> > > Hi guys ...
> > > I need some help as my cluster seems to be corrupted.
> > >
> > > I saw here ..
> > > https://www.mail-archive.com/ceph-users@xxxxxxxxxxxxxx/msg01919.html
> > > .. a msg from 2013 where Peter had a problem with his monitors.
> > >
> > > I had the same problem today when trying to add a new monitor,
> > > and then playing with the monmap as the monitors were not entering
> > > the quorum. I'm using version 0.80.8.
> > >
> > > Right now my cluster won't start because of a corrupted monitor.
> > > Is it possible to remove all monitors and create just a new one
> > > without losing data? I have ~260GB of data representing 2 weeks
> > > of work.
> > >
> > > What should I do? Do you recommend any specific procedure?
> > >
> > > Thanks a lot.
> > > Jose Tavares
> > >
> > > _______________________________________________
> > > ceph-users mailing list
> > > ceph-users@xxxxxxxxxxxxxx
> > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>

--
Wido den Hollander
42on B.V.
Ceph trainer and consultant

Phone: +31 (0)20 700 9902
Skype: contact42on
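
On the advice in this thread to check the time on the machines first: the
following is only a minimal Python sketch of how clock readings from the
monitors, OSDs and clients could be compared once they have been collected.
The host names, the timestamps and the helper find_skewed_hosts are all
hypothetical and are not part of Ceph. The 0.05 second threshold is an
assumption meant to mirror the "mon clock drift allowed" setting; verify
the actual value for your release.

```python
from statistics import median


def find_skewed_hosts(host_times, allowed_drift=0.05):
    """Return {host: offset_seconds} for every host whose clock differs
    from the median of all reported clocks by more than allowed_drift.

    host_times maps a host name to an epoch timestamp (seconds) that was
    sampled at (roughly) the same instant on each machine.
    allowed_drift is an assumed tolerance, not a value read from Ceph.
    """
    if not host_times:
        return {}
    # The median is used as the reference so that a single wildly wrong
    # clock does not shift the baseline.
    reference = median(host_times.values())
    return {
        host: round(ts - reference, 3)
        for host, ts in host_times.items()
        if abs(ts - reference) > allowed_drift
    }


# Hypothetical readings: three hosts agree closely, one is an hour ahead.
samples = {
    "mon-a": 1447736160.00,
    "mon-b": 1447736160.02,
    "osd-1": 1447736159.99,
    "client-1": 1447739760.00,
}

skewed = find_skewed_hosts(samples)
for host, offset in sorted(skewed.items()):
    print(f"{host}: clock is off by {offset:+.3f} s")
```

With these made-up readings only client-1 is reported. How the timestamps
are gathered (NTP queries, ssh, a monitoring agent) is left open here.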