On Fri, Oct 7, 2016 at 8:04 AM, James Horner <humankind135@xxxxxxxxx> wrote:
> Hi All
>
> Just wondering if anyone can help me out here. Small home cluster with 1
> mon, the next phase of the plan called for more but I hadn't got there yet.
>
> I was trying to set up CephFS and I ran "ceph fs new" without having an MDS
> as I was having issues with rank 0 immediately being degraded. My thinking
> was that I would bring up an MDS and it would be assigned to rank 0. Anyhoo
> after I did that my mon crashed and I haven't been able to restart it since,
> its output is:
>
> root@bertie ~ $ /usr/bin/ceph-mon -f --cluster ceph --id bertie --setuser
> ceph --setgroup ceph 2>&1 | tee /var/log/ceph/mon-temp
> starting mon.bertie rank 0 at 192.168.2.3:6789/0 mon_data
> /var/lib/ceph/mon/ceph-bertie fsid 06e2f4e0-35e1-4f8c-b2a0-bc72c4cd3199
> terminate called after throwing an instance of 'std::out_of_range'
> what(): map::at
> *** Caught signal (Aborted) **
> in thread 7fad7f86c480 thread_name:ceph-mon
> ceph version 10.2.3 (ecc23778eb545d8dd55e2e4735b53cc93f92e65b)
> 1: (()+0x525737) [0x56219142b737]
> 2: (()+0xf8d0) [0x7fad7eb3c8d0]
> 3: (gsignal()+0x37) [0x7fad7cdc6067]
> 4: (abort()+0x148) [0x7fad7cdc7448]
> 5: (__gnu_cxx::__verbose_terminate_handler()+0x15d) [0x7fad7d6b3b3d]
> 6: (()+0x5ebb6) [0x7fad7d6b1bb6]
> 7: (()+0x5ec01) [0x7fad7d6b1c01]
> 8: (()+0x5ee19) [0x7fad7d6b1e19]
> 9: (std::__throw_out_of_range(char const*)+0x66) [0x7fad7d707b76]
> 10: (FSMap::get_filesystem(int) const+0x7c) [0x56219126ed6c]
> 11: (MDSMonitor::maybe_promote_standby(std::shared_ptr<Filesystem>)+0x48a)
> [0x56219125b13a]
> 12: (MDSMonitor::tick()+0x4bb) [0x56219126084b]
> 13: (MDSMonitor::on_active()+0x28) [0x562191255da8]
> 14: (PaxosService::_active()+0x60a) [0x5621911d896a]
> 15: (PaxosService::election_finished()+0x7a) [0x5621911d8d7a]
> 16: (Monitor::win_election(unsigned int, std::set<int, std::less<int>,
> std::allocator<int> >&, unsigned long, MonCommand const*, int, std::set<int,
> std::less<int>, std::allocator<int> > const*)+0x24e) [0x5621911958ce]
> 17: (Monitor::win_standalone_election()+0x20f) [0x562191195d9f]
> 18: (Monitor::bootstrap()+0x91b) [0x56219119676b]
> 19: (Monitor::init()+0x17d) [0x562191196a5d]
> 20: (main()+0x2694) [0x562191106f44]
> 21: (__libc_start_main()+0xf5) [0x7fad7cdb2b45]
> 22: (()+0x257edf) [0x56219115dedf]
> 2016-10-07 06:50:39.049061 7fad7f86c480 -1 *** Caught signal (Aborted) **
> in thread 7fad7f86c480 thread_name:ceph-mon
>
> ceph version 10.2.3 (ecc23778eb545d8dd55e2e4735b53cc93f92e65b)
> 1: (()+0x525737) [0x56219142b737]
> 2: (()+0xf8d0) [0x7fad7eb3c8d0]
> 3: (gsignal()+0x37) [0x7fad7cdc6067]
> 4: (abort()+0x148) [0x7fad7cdc7448]
> 5: (__gnu_cxx::__verbose_terminate_handler()+0x15d) [0x7fad7d6b3b3d]
> 6: (()+0x5ebb6) [0x7fad7d6b1bb6]
> 7: (()+0x5ec01) [0x7fad7d6b1c01]
> 8: (()+0x5ee19) [0x7fad7d6b1e19]
> 9: (std::__throw_out_of_range(char const*)+0x66) [0x7fad7d707b76]
> 10: (FSMap::get_filesystem(int) const+0x7c) [0x56219126ed6c]
> 11: (MDSMonitor::maybe_promote_standby(std::shared_ptr<Filesystem>)+0x48a)
> [0x56219125b13a]
> 12: (MDSMonitor::tick()+0x4bb) [0x56219126084b]
> 13: (MDSMonitor::on_active()+0x28) [0x562191255da8]
> 14: (PaxosService::_active()+0x60a) [0x5621911d896a]
> 15: (PaxosService::election_finished()+0x7a) [0x5621911d8d7a]
> 16: (Monitor::win_election(unsigned int, std::set<int, std::less<int>,
> std::allocator<int> >&, unsigned long, MonCommand const*, int, std::set<int,
> std::less<int>, std::allocator<int> > const*)+0x24e) [0x5621911958ce]
> 17: (Monitor::win_standalone_election()+0x20f) [0x562191195d9f]
> 18: (Monitor::bootstrap()+0x91b) [0x56219119676b]
> 19: (Monitor::init()+0x17d) [0x562191196a5d]
> 20: (main()+0x2694) [0x562191106f44]
> 21: (__libc_start_main()+0xf5) [0x7fad7cdb2b45]
> 22: (()+0x257edf) [0x56219115dedf]
> NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to
> interpret this.
>
> 0> 2016-10-07 06:50:39.049061 7fad7f86c480 -1 *** Caught signal
> (Aborted) **
> in thread 7fad7f86c480 thread_name:ceph-mon
>
> ceph version 10.2.3 (ecc23778eb545d8dd55e2e4735b53cc93f92e65b)
> 1: (()+0x525737) [0x56219142b737]
> 2: (()+0xf8d0) [0x7fad7eb3c8d0]
> 3: (gsignal()+0x37) [0x7fad7cdc6067]
> 4: (abort()+0x148) [0x7fad7cdc7448]
> 5: (__gnu_cxx::__verbose_terminate_handler()+0x15d) [0x7fad7d6b3b3d]
> 6: (()+0x5ebb6) [0x7fad7d6b1bb6]
> 7: (()+0x5ec01) [0x7fad7d6b1c01]
> 8: (()+0x5ee19) [0x7fad7d6b1e19]
> 9: (std::__throw_out_of_range(char const*)+0x66) [0x7fad7d707b76]
> 10: (FSMap::get_filesystem(int) const+0x7c) [0x56219126ed6c]
> 11: (MDSMonitor::maybe_promote_standby(std::shared_ptr<Filesystem>)+0x48a)
> [0x56219125b13a]
> 12: (MDSMonitor::tick()+0x4bb) [0x56219126084b]
> 13: (MDSMonitor::on_active()+0x28) [0x562191255da8]
> 14: (PaxosService::_active()+0x60a) [0x5621911d896a]
> 15: (PaxosService::election_finished()+0x7a) [0x5621911d8d7a]
> 16: (Monitor::win_election(unsigned int, std::set<int, std::less<int>,
> std::allocator<int> >&, unsigned long, MonCommand const*, int, std::set<int,
> std::less<int>, std::allocator<int> > const*)+0x24e) [0x5621911958ce]
> 17: (Monitor::win_standalone_election()+0x20f) [0x562191195d9f]
> 18: (Monitor::bootstrap()+0x91b) [0x56219119676b]
> 19: (Monitor::init()+0x17d) [0x562191196a5d]
> 20: (main()+0x2694) [0x562191106f44]
> 21: (__libc_start_main()+0xf5) [0x7fad7cdb2b45]
> 22: (()+0x257edf) [0x56219115dedf]
> NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to
> interpret this.
>
> Fairly sure it's a CephFS error due to:
> 9: (std::__throw_out_of_range(char const*)+0x66) [0x7fad7d707b76]
> 10: (FSMap::get_filesystem(int) const+0x7c) [0x56219126ed6c]

It looks like you're hitting this:
http://tracker.ceph.com/issues/17466

There is a branch called wip-17466-jewel that has a fix cherry-picked
onto 10.2.3. Hopefully, if you install the mon from that branch, your
mons will be happy again.

Packages:
http://gitbuilder.ceph.com/ceph-deb-trusty-x86_64-basic/ref/wip-17466-jewel/
http://gitbuilder.ceph.com/ceph-rpm-centos7-x86_64-basic/ref/wip-17466-jewel/

Or of course you can build your own if you're on a platform that isn't
on gitbuilder.ceph.com

John

> I have nothing in the CephFS but I had just finished moving all my VMs into
> rados. I don't care if CephFS gets wiped but I really need the vm images.
>
> If the mon is borked permanently then is there a way I can recover the
> images manually?
>
> Thanks in advance for any help
>
> James
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
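
For anyone reading the trace in the quoted message: `what(): map::at` together with frames 9 and 10 is consistent with a `std::map::at` lookup on a key that is not present, somewhere under `FSMap::get_filesystem`. The sketch below is not Ceph code. `FilesystemSketch`, `FsMapSketch`, `find_filesystem` and `lookup_throws` are hypothetical names made up for illustration, and the real cause and fix are the ones described in the tracker issue. It only shows the general failure mode: `at()` throws `std::out_of_range` for a missing key, and if nothing catches it the process ends in `std::terminate` and `abort()`, which is what frames 3 through 9 look like.

```cpp
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for a filesystem entry; not the real Ceph type.
struct FilesystemSketch {
    std::string name;
};

// Hypothetical stand-in for a map of filesystem id -> filesystem.
class FsMapSketch {
public:
    void add(int fscid, std::string name) {
        filesystems_[fscid] =
            std::make_shared<FilesystemSketch>(FilesystemSketch{std::move(name)});
    }

    // Throwing lookup: std::map::at raises std::out_of_range when the
    // key is absent. Uncaught, that exception terminates the process.
    std::shared_ptr<FilesystemSketch> get_filesystem(int fscid) const {
        return filesystems_.at(fscid);
    }

    // Non-throwing lookup: returns an empty pointer when the key is absent,
    // so the caller has to check the result before using it.
    std::shared_ptr<FilesystemSketch> find_filesystem(int fscid) const {
        auto it = filesystems_.find(fscid);
        if (it == filesystems_.end()) {
            return nullptr;
        }
        return it->second;
    }

private:
    std::map<int, std::shared_ptr<FilesystemSketch>> filesystems_;
};

// Reports whether the throwing lookup raised std::out_of_range for this id.
bool lookup_throws(const FsMapSketch& fsmap, int fscid) {
    try {
        (void)fsmap.get_filesystem(fscid);
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}
```

Calling `get_filesystem` with an id that was never added and without a surrounding `try` would abort the program in the same way the mon does above. Whether the upstream fix changes the lookup or the id being passed in is not something this sketch tries to answer.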