Hi Paul,

Yes, that's what I did - it caused some errors. In the end I had to delete the /var/lib/ceph/mon/* directory on the bad node and run ceph-mon with the --mkfs argument to recreate the database. I am good now - thanks. :)

On Tue, Jul 10, 2018 at 10:46 PM, Paul Emmerich <paul.emmerich@xxxxxxxx> wrote:
> easy:
>
> 1. make sure that none of the mons are running
> 2. extract the monmap from the good one
> 3. use monmaptool to remove the two other mons from it
> 4. inject the monmap back into the good mon
> 5. start the good mon
> 6. you now have a running cluster with only one mon, add two new ones
>
>
> Paul
>
>
> 2018-07-10 5:50 GMT+02:00 Syahrul Sazli Shaharir <sazli@xxxxxxxxxx>:
>>
>> Hi,
>>
>> I am running Proxmox pve-5.1, with Ceph Luminous 12.2.4 as storage. I have been running 3 monitors, up until an abrupt power outage that left 2 monitors down and unable to start, and 1 monitor up but without quorum.
>>
>> I tried extracting the monmap from the good monitor and injecting it into the other two, but got a different error for each:
>>
>> 1. mon.mail1
>>
>> # ceph-mon -i mail1 --inject-monmap /tmp/monmap
>> 2018-07-10 11:29:03.562840 7f7d82845f80 -1 abort: Corruption: Bad table magic number*** Caught signal (Aborted) **
>> in thread 7f7d82845f80 thread_name:ceph-mon
>>
>> ceph version 12.2.4 (4832b6f0acade977670a37c20ff5dbe69e727416) luminous (stable)
>> 1: (()+0x9439e4) [0x5652655669e4]
>> 2: (()+0x110c0) [0x7f7d81bfe0c0]
>> 3: (gsignal()+0xcf) [0x7f7d7ee12fff]
>> 4: (abort()+0x16a) [0x7f7d7ee1442a]
>> 5: (RocksDBStore::get(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, ceph::buffer::list*)+0x2f9) [0x5652650a2eb9]
>> 6: (main()+0x1377) [0x565264ec3c57]
>> 7: (__libc_start_main()+0xf1) [0x7f7d7ee002e1]
>> 8: (_start()+0x2a) [0x565264f5954a]
>> 2018-07-10 11:29:03.563721 7f7d82845f80 -1 *** Caught signal (Aborted) **
>> in thread 7f7d82845f80 thread_name:ceph-mon
>>
>> 2. mon.mail2
>>
>> # ceph-mon -i mail2 --inject-monmap /tmp/monmap
>> 2018-07-10 11:18:07.536097 7f161e2e3f80 -1 rocksdb: Corruption: Can't access /065339.sst: IO error: /var/lib/ceph/mon/ceph-mail2/store.db/065339.sst: No such file or directory
>> Can't access /065337.sst: IO error: /var/lib/ceph/mon/ceph-mail2/store.db/065337.sst: No such file or directory
>>
>> 2018-07-10 11:18:07.536106 7f161e2e3f80 -1 error opening mon data directory at '/var/lib/ceph/mon/ceph-mail2': (22) Invalid argument
>>
>> Is there any other way I can recover, other than rebuilding the monitor store from the OSDs?
>>
>> Thanks.
>>
>> --
>> --sazli
>> Syahrul Sazli Shaharir <sazli@xxxxxxxxxx>
>
>
>
>
> --
> Paul Emmerich
>
> Looking for help with your Ceph cluster? Contact us at https://croit.io
>
> croit GmbH
> Freseniusstr. 31h
> 81247 München
> www.croit.io
> Tel: +49 89 1896585 90

--
--sazli
Syahrul Sazli Shaharir <sazli@xxxxxxxxxx>
Mobile: +6019 385 8301 - YM/Skype: syahrulsazli
System Administrator
TMK Pulasan (002339810-M) http://pulasan.my/
11 Jalan 3/4, 43650 Bandar Baru Bangi, Selangor, Malaysia.
Tel/Fax: +603 8926 0338

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
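
For reference, Paul's six steps map roughly to the commands below. This is only a sketch: "mail0" is a placeholder for the surviving monitor's id (the real id is not given in the thread), and it assumes a systemd-managed mon with the default cluster name and data path.

# mail0 is a placeholder id for the good mon; mail1 and mail2 are the dead ones
systemctl stop ceph-mon@mail0
ceph-mon -i mail0 --extract-monmap /tmp/monmap
monmaptool /tmp/monmap --rm mail1 --rm mail2
ceph-mon -i mail0 --inject-monmap /tmp/monmap
# if the extract/inject was run as root, restore ownership before restarting
chown -R ceph:ceph /var/lib/ceph/mon/ceph-mail0
systemctl start ceph-mon@mail0

Once the single mon is up and has quorum with itself, mail1 and mail2 can be re-added as fresh monitors.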
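
Similarly, a rough sketch of the --mkfs recreation mentioned at the top of the thread, shown here for mon id mail1; the monmap and keyring paths are illustrative, and it assumes the surviving mon is already up and answering commands.

systemctl stop ceph-mon@mail1
# keep a copy of the corrupt store rather than deleting it outright
mv /var/lib/ceph/mon/ceph-mail1 /var/lib/ceph/mon/ceph-mail1.bad
# fetch the current monmap and mon keyring from the running cluster
ceph mon getmap -o /tmp/monmap
ceph auth get mon. -o /tmp/mon.keyring
# rebuild an empty mon store for mail1
ceph-mon -i mail1 --mkfs --monmap /tmp/monmap --keyring /tmp/mon.keyring
chown -R ceph:ceph /var/lib/ceph/mon/ceph-mail1
systemctl start ceph-mon@mail1

Depending on whether mail1 is still present in the monmap, it may also need to be added back explicitly (ceph mon add mail1 <ip>:6789) before it can join quorum and sync its store from the existing mon.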