Hi John
Thanks for that, life saver! I'm running Debian Jessie, so I replaced the main Ceph repo in sources.list.d with:
deb http://gitbuilder.ceph.com/ceph-deb-jessie-x86_64-basic/ref/wip-17466-jewel/ jessie main
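For anyone else who lands here: the repo line follows the same URL layout as the gitbuilder links John posted, with only the distro codename swapped. A minimal sketch of generating it is below. The helper name `gitbuilder_deb_line` is made up for illustration, and the file name `ceph.list` and the `ceph-mon` package name in the comments are assumptions about a typical Debian setup, not something confirmed in this thread.

```shell
#!/bin/sh
# Hypothetical helper: print the APT source line for a gitbuilder branch build.
# The URL layout is copied from the repo line quoted in this thread; nothing
# else about gitbuilder.ceph.com is assumed.
gitbuilder_deb_line() {
    codename="$1"   # distro codename, e.g. jessie
    branch="$2"     # branch ref, e.g. wip-17466-jewel
    printf 'deb http://gitbuilder.ceph.com/ceph-deb-%s-x86_64-basic/ref/%s/ %s main\n' \
        "$codename" "$branch" "$codename"
}

# Print the line for Debian Jessie and the branch carrying the fix.
gitbuilder_deb_line jessie wip-17466-jewel

# Assumed follow-up steps, run as root and adjusted to your own setup:
#   gitbuilder_deb_line jessie wip-17466-jewel > /etc/apt/sources.list.d/ceph.list
#   apt-get update
#   apt-get install ceph-mon
```

Only the first command actually runs; the follow-up steps are left as comments because the right file name and package set depend on how Ceph was installed in the first place.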
On 7 October 2016 at 11:37, John Spray <jspray@xxxxxxxxxx> wrote:
It looks like you're hitting this:
On Fri, Oct 7, 2016 at 8:04 AM, James Horner <humankind135@xxxxxxxxx> wrote:
> Hi All
>
> Just wondering if anyone can help me out here. Small home cluster with 1
> mon, the next phase of the plan called for more but I hadn't got there yet.
>
> I was trying to set up CephFS and I ran "ceph fs new" without having an MDS
> as I was having issues with rank 0 immediately being degraded. My thinking
> was that I would bring up an MDS and it would be assigned to rank 0. Anyhoo
> after I did that my mon crashed and I haven't been able to restart it since.
> Its output is:
>
> root@bertie ~ $ /usr/bin/ceph-mon -f --cluster ceph --id bertie --setuser
> ceph --setgroup ceph 2>&1 | tee /var/log/ceph/mon-temp
> starting mon.bertie rank 0 at 192.168.2.3:6789/0 mon_data
> /var/lib/ceph/mon/ceph-bertie fsid 06e2f4e0-35e1-4f8c-b2a0-bc72c4cd3199
> terminate called after throwing an instance of 'std::out_of_range'
> what(): map::at
> *** Caught signal (Aborted) **
> in thread 7fad7f86c480 thread_name:ceph-mon
> ceph version 10.2.3 (ecc23778eb545d8dd55e2e4735b53cc93f92e65b)
> 1: (()+0x525737) [0x56219142b737]
> 2: (()+0xf8d0) [0x7fad7eb3c8d0]
> 3: (gsignal()+0x37) [0x7fad7cdc6067]
> 4: (abort()+0x148) [0x7fad7cdc7448]
> 5: (__gnu_cxx::__verbose_terminate_handler()+0x15d) [0x7fad7d6b3b3d]
> 6: (()+0x5ebb6) [0x7fad7d6b1bb6]
> 7: (()+0x5ec01) [0x7fad7d6b1c01]
> 8: (()+0x5ee19) [0x7fad7d6b1e19]
> 9: (std::__throw_out_of_range(char const*)+0x66) [0x7fad7d707b76]
> 10: (FSMap::get_filesystem(int) const+0x7c) [0x56219126ed6c]
> 11: (MDSMonitor::maybe_promote_standby(std::shared_ptr< Filesystem>)+0x48a)
> [0x56219125b13a]
> 12: (MDSMonitor::tick()+0x4bb) [0x56219126084b]
> 13: (MDSMonitor::on_active()+0x28) [0x562191255da8]
> 14: (PaxosService::_active()+0x60a) [0x5621911d896a]
> 15: (PaxosService::election_finished()+0x7a) [0x5621911d8d7a]
> 16: (Monitor::win_election(unsigned int, std::set<int, std::less<int>,
> std::allocator<int> >&, unsigned long, MonCommand const*, int, std::set<int,
> std::less<int>, std::allocator<int> > const*)+0x24e) [0x5621911958ce]
> 17: (Monitor::win_standalone_election()+0x20f) [0x562191195d9f]
> 18: (Monitor::bootstrap()+0x91b) [0x56219119676b]
> 19: (Monitor::init()+0x17d) [0x562191196a5d]
> 20: (main()+0x2694) [0x562191106f44]
> 21: (__libc_start_main()+0xf5) [0x7fad7cdb2b45]
> 22: (()+0x257edf) [0x56219115dedf]
> 2016-10-07 06:50:39.049061 7fad7f86c480 -1 *** Caught signal (Aborted) **
> in thread 7fad7f86c480 thread_name:ceph-mon
>
> ceph version 10.2.3 (ecc23778eb545d8dd55e2e4735b53cc93f92e65b)
> 1: (()+0x525737) [0x56219142b737]
> 2: (()+0xf8d0) [0x7fad7eb3c8d0]
> 3: (gsignal()+0x37) [0x7fad7cdc6067]
> 4: (abort()+0x148) [0x7fad7cdc7448]
> 5: (__gnu_cxx::__verbose_terminate_handler()+0x15d) [0x7fad7d6b3b3d]
> 6: (()+0x5ebb6) [0x7fad7d6b1bb6]
> 7: (()+0x5ec01) [0x7fad7d6b1c01]
> 8: (()+0x5ee19) [0x7fad7d6b1e19]
> 9: (std::__throw_out_of_range(char const*)+0x66) [0x7fad7d707b76]
> 10: (FSMap::get_filesystem(int) const+0x7c) [0x56219126ed6c]
> 11: (MDSMonitor::maybe_promote_standby(std::shared_ptr< Filesystem>)+0x48a)
> [0x56219125b13a]
> 12: (MDSMonitor::tick()+0x4bb) [0x56219126084b]
> 13: (MDSMonitor::on_active()+0x28) [0x562191255da8]
> 14: (PaxosService::_active()+0x60a) [0x5621911d896a]
> 15: (PaxosService::election_finished()+0x7a) [0x5621911d8d7a]
> 16: (Monitor::win_election(unsigned int, std::set<int, std::less<int>,
> std::allocator<int> >&, unsigned long, MonCommand const*, int, std::set<int,
> std::less<int>, std::allocator<int> > const*)+0x24e) [0x5621911958ce]
> 17: (Monitor::win_standalone_election()+0x20f) [0x562191195d9f]
> 18: (Monitor::bootstrap()+0x91b) [0x56219119676b]
> 19: (Monitor::init()+0x17d) [0x562191196a5d]
> 20: (main()+0x2694) [0x562191106f44]
> 21: (__libc_start_main()+0xf5) [0x7fad7cdb2b45]
> 22: (()+0x257edf) [0x56219115dedf]
> NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to
> interpret this.
>
> 0> 2016-10-07 06:50:39.049061 7fad7f86c480 -1 *** Caught signal
> (Aborted) **
> in thread 7fad7f86c480 thread_name:ceph-mon
>
> ceph version 10.2.3 (ecc23778eb545d8dd55e2e4735b53cc93f92e65b)
> 1: (()+0x525737) [0x56219142b737]
> 2: (()+0xf8d0) [0x7fad7eb3c8d0]
> 3: (gsignal()+0x37) [0x7fad7cdc6067]
> 4: (abort()+0x148) [0x7fad7cdc7448]
> 5: (__gnu_cxx::__verbose_terminate_handler()+0x15d) [0x7fad7d6b3b3d]
> 6: (()+0x5ebb6) [0x7fad7d6b1bb6]
> 7: (()+0x5ec01) [0x7fad7d6b1c01]
> 8: (()+0x5ee19) [0x7fad7d6b1e19]
> 9: (std::__throw_out_of_range(char const*)+0x66) [0x7fad7d707b76]
> 10: (FSMap::get_filesystem(int) const+0x7c) [0x56219126ed6c]
> 11: (MDSMonitor::maybe_promote_standby(std::shared_ptr< Filesystem>)+0x48a)
> [0x56219125b13a]
> 12: (MDSMonitor::tick()+0x4bb) [0x56219126084b]
> 13: (MDSMonitor::on_active()+0x28) [0x562191255da8]
> 14: (PaxosService::_active()+0x60a) [0x5621911d896a]
> 15: (PaxosService::election_finished()+0x7a) [0x5621911d8d7a]
> 16: (Monitor::win_election(unsigned int, std::set<int, std::less<int>,
> std::allocator<int> >&, unsigned long, MonCommand const*, int, std::set<int,
> std::less<int>, std::allocator<int> > const*)+0x24e) [0x5621911958ce]
> 17: (Monitor::win_standalone_election()+0x20f) [0x562191195d9f]
> 18: (Monitor::bootstrap()+0x91b) [0x56219119676b]
> 19: (Monitor::init()+0x17d) [0x562191196a5d]
> 20: (main()+0x2694) [0x562191106f44]
> 21: (__libc_start_main()+0xf5) [0x7fad7cdb2b45]
> 22: (()+0x257edf) [0x56219115dedf]
> NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to
> interpret this.
>
> Fairly sure it's a CephFS error due to:
> 9: (std::__throw_out_of_range(char const*)+0x66) [0x7fad7d707b76]
> 10: (FSMap::get_filesystem(int) const+0x7c) [0x56219126ed6c]
http://tracker.ceph.com/issues/17466
There is a branch called wip-17466-jewel that has a fix cherry-picked
onto 10.2.3 -- hopefully if you install the mon from that branch then
your mons will be happy again.
Packages:
http://gitbuilder.ceph.com/ceph-deb-trusty-x86_64-basic/ref/wip-17466-jewel/
http://gitbuilder.ceph.com/ceph-rpm-centos7-x86_64-basic/ref/wip-17466-jewel/
Or of course you can build your own if you're on a platform that isn't
on gitbuilder.ceph.com
John
> I have nothing in the CephFS but I had just finished moving all my VMs into
> rados. I don't care if CephFS gets wiped but I really need the vm images.
>
> If the mon is borked permanently then is there a way I can recover the
> images manually?
>
> Thanks in advance for any help
>
> James
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>