On Tuesday, July 23, 2013 at 4:46 PM, peter@xxxxxxxxx wrote:
On 2013-07-22 18:20, Joao Eduardo Luis wrote:
On 07/22/2013 04:59 PM, peter@xxxxxxxxx wrote:
Hi Joao,

I have sent you the link to the monitor files. I stopped one other monitor to have a consistent tarball but now it won't start, crashing with the same error message. I hope there is a trick to get it working again because now I only have one monitor working and I don't want to end up losing data again (I had this happen once before).

Thanks! This is the very next thing in my queue!

-Joao

Hi Joao,

Any update on this issue perhaps? It seems I'm not the only one with this problem. Our cluster isn't working anymore (only 1 monitor left) so I'd recommend anyone running 0.61.5 not to reboot or restart their monitors until it is known what is going on :(
I just rebooted one mon server running 0.61.5 (had to!)
and it didn't crash (yet?). I guess I was lucky…
Cheers, Dan
Thanks,
Peter

On 2013-07-22 17:31, Joao Eduardo Luis wrote:
On 07/22/2013 12:33 PM, peter@xxxxxxxxx wrote:
Hello,

After a reboot one of our monitors is unable to start. We did an upgrade from 0.61.4 to 0.61.5 last week without problems (the monitor restarted just fine).

We are getting the following error (I think it is the same as: http://tracker.ceph.com/issues/5704). I might have missed it on the list though. If you want I can send the contents of the monitor directory.

That monitor store would be greatly appreciated! If you could bundle the store of two other monitors it would be great.

-Joao

2013-07-22 13:24:02.183558 7fd06127e780 0 ceph version 0.61.5 (8ee10dc4bb73bdd918873f29c70eedc3c7ef1979), process ceph-mon, pid 28540
2013-07-22 13:24:02.251205 7fd05d320700 -1 asok(0x207e000) AdminSocket: request 'mon_status' not defined
2013-07-22 13:24:02.357287 7fd06127e780 1 mon.narr9@-1(probing) e1 preinit fsid 97e515bb-d334-4fa7-8b53-7d85615809fd
2013-07-22 13:24:02.374158 7fd06127e780 -1 mon/OSDMonitor.cc: In function 'virtual void OSDMonitor::update_from_paxos(bool*)' thread 7fd06127e780 time 2013-07-22 13:24:02.373344
mon/OSDMonitor.cc: 132: FAILED assert(latest_bl.length() != 0)

ceph version 0.61.5 (8ee10dc4bb73bdd918873f29c70eedc3c7ef1979)
1: /usr/bin/ceph-mon() [0x5073d6]
2: (PaxosService::refresh(bool*)+0x19b) [0x4edd4b]
3: (Monitor::refresh_from_paxos(bool*)+0x57) [0x48e5a7]
4: (Monitor::init_paxos()+0xf5) [0x48e755]
5: (Monitor::preinit()+0x6ac) [0x4a4e7c]
6: (main()+0x1c19) [0x483559]
7: (__libc_start_main()+0xed) [0x7fd05f4da76d]
8: /usr/bin/ceph-mon() [0x485e7d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- begin dump of recent events ---
-26> 2013-07-22 13:24:02.181870 7fd06127e780 5 asok(0x207e000) register_command perfcounters_dump hook 0x2073010
-25> 2013-07-22 13:24:02.181908 7fd06127e780 5 asok(0x207e000) register_command 1 hook 0x2073010
-24> 2013-07-22 13:24:02.181915 7fd06127e780 5 asok(0x207e000) register_command perf dump hook 0x2073010
-23> 2013-07-22 13:24:02.181929 7fd06127e780 5 asok(0x207e000) register_command perfcounters_schema hook 0x2073010
-22> 2013-07-22 13:24:02.181939 7fd06127e780 5 asok(0x207e000) register_command 2 hook 0x2073010
-21> 2013-07-22 13:24:02.181941 7fd06127e780 5 asok(0x207e000) register_command perf schema hook 0x2073010
-20> 2013-07-22 13:24:02.181945 7fd06127e780 5 asok(0x207e000) register_command config show hook 0x2073010
-19> 2013-07-22 13:24:02.181954 7fd06127e780 5 asok(0x207e000) register_command config set hook 0x2073010
-18> 2013-07-22 13:24:02.181957 7fd06127e780 5 asok(0x207e000) register_command log flush hook 0x2073010
-17> 2013-07-22 13:24:02.181959 7fd06127e780 5 asok(0x207e000) register_command log dump hook 0x2073010
-16> 2013-07-22 13:24:02.181964 7fd06127e780 5 asok(0x207e000) register_command log reopen hook 0x2073010
-15> 2013-07-22 13:24:02.183558 7fd06127e780 0 ceph version 0.61.5 (8ee10dc4bb73bdd918873f29c70eedc3c7ef1979), process ceph-mon, pid 28540
-14> 2013-07-22 13:24:02.186703 7fd06127e780 5 asok(0x207e000) init /var/run/ceph/ceph-mon.narr9.asok
-13> 2013-07-22 13:24:02.186734 7fd06127e780 5 asok(0x207e000) bind_and_listen /var/run/ceph/ceph-mon.narr9.asok
-12> 2013-07-22 13:24:02.186780 7fd06127e780 5 asok(0x207e000) register_command 0 hook 0x20720b0
-11> 2013-07-22 13:24:02.186790 7fd06127e780 5 asok(0x207e000) register_command version hook 0x20720b0
-10> 2013-07-22 13:24:02.186798 7fd06127e780 5 asok(0x207e000) register_command git_version hook 0x20720b0
-9> 2013-07-22 13:24:02.186806 7fd06127e780 5 asok(0x207e000) register_command help hook 0x20730d0
-8> 2013-07-22 13:24:02.186850 7fd05d320700 5 asok(0x207e000) entry start
-7> 2013-07-22 13:24:02.251205 7fd05d320700 -1 asok(0x207e000) AdminSocket: request 'mon_status' not defined
-6> 2013-07-22 13:24:02.357202 7fd06127e780 1 -- 10.255.0.25:6789/0 learned my addr 10.255.0.25:6789/0
-5> 2013-07-22 13:24:02.357215 7fd06127e780 1 accepter.accepter.bind my_inst.addr is 10.255.0.25:6789/0 need_addr=0
-4> 2013-07-22 13:24:02.357242 7fd06127e780 5 adding auth protocol: cephx
-3> 2013-07-22 13:24:02.357245 7fd06127e780 5 adding auth protocol: cephx
-2> 2013-07-22 13:24:02.357287 7fd06127e780 1 mon.narr9@-1(probing) e1 preinit fsid 97e515bb-d334-4fa7-8b53-7d85615809fd
-1> 2013-07-22 13:24:02.372987 7fd06127e780 4 mon.narr9@-1(probing).mds e182116 new map
0> 2013-07-22 13:24:02.374158 7fd06127e780 -1 mon/OSDMonitor.cc: In function 'virtual void OSDMonitor::update_from_paxos(bool*)' thread 7fd06127e780 time 2013-07-22 13:24:02.373344
mon/OSDMonitor.cc: 132: FAILED assert(latest_bl.length() != 0)

ceph version 0.61.5 (8ee10dc4bb73bdd918873f29c70eedc3c7ef1979)
1: /usr/bin/ceph-mon() [0x5073d6]
2: (PaxosService::refresh(bool*)+0x19b) [0x4edd4b]
3: (Monitor::refresh_from_paxos(bool*)+0x57) [0x48e5a7]
4: (Monitor::init_paxos()+0xf5) [0x48e755]
5: (Monitor::preinit()+0x6ac) [0x4a4e7c]
6: (main()+0x1c19) [0x483559]
7: (__libc_start_main()+0xed) [0x7fd05f4da76d]
8: /usr/bin/ceph-mon() [0x485e7d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- logging levels ---
0/ 5 none
0/ 1 lockdep
0/ 1 context
1/ 1 crush
1/ 5 mds
1/ 5 mds_balancer
1/ 5 mds_locker
1/ 5 mds_log
1/ 5 mds_log_expire
1/ 5 mds_migrator
0/ 1 buffer
0/ 1 timer
0/ 1 filer
0/ 1 striper
0/ 1 objecter
0/ 5 rados
0/ 5 rbd
0/ 5 journaler
0/ 5 objectcacher
0/ 5 client
0/ 5 osd
0/ 5 optracker
0/ 5 objclass
1/ 3 filestore
1/ 3 journal
0/ 5 ms
1/ 5 mon
0/10 monc
0/ 5 paxos
0/ 5 tp
1/ 5 auth
1/ 5 crypto
1/ 1 finisher
1/ 5 heartbeatmap
1/ 5 perfcounter
1/ 5 rgw
1/ 5 hadoop
1/ 5 javaclient
1/ 5 asok
1/ 1 throttle
-2/-2 (syslog threshold)
-1/-1 (stderr threshold)
max_recent 10000
max_new 1000
log_file /var/log/ceph/ceph-mon.narr9.log
--- end dump of recent events ---
2013-07-22 13:24:02.376004 7fd06127e780 -1 *** Caught signal (Aborted) **
in thread 7fd06127e780

ceph version 0.61.5 (8ee10dc4bb73bdd918873f29c70eedc3c7ef1979)
1: /usr/bin/ceph-mon() [0x59743a]
2: (()+0xfcb0) [0x7fd060919cb0]
3: (gsignal()+0x35) [0x7fd05f4ef425]
4: (abort()+0x17b) [0x7fd05f4f2b8b]
5: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7fd05fe4169d]
6: (()+0xb5846) [0x7fd05fe3f846]
7: (()+0xb5873) [0x7fd05fe3f873]
8: (()+0xb596e) [0x7fd05fe3f96e]
9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1df) [0x64f6ef]
10: /usr/bin/ceph-mon() [0x5073d6]
11: (PaxosService::refresh(bool*)+0x19b) [0x4edd4b]
12: (Monitor::refresh_from_paxos(bool*)+0x57) [0x48e5a7]
13: (Monitor::init_paxos()+0xf5) [0x48e755]
14: (Monitor::preinit()+0x6ac) [0x4a4e7c]
15: (main()+0x1c19) [0x483559]
16: (__libc_start_main()+0xed) [0x7fd05f4da76d]
17: /usr/bin/ceph-mon() [0x485e7d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- begin dump of recent events ---
0> 2013-07-22 13:24:02.376004 7fd06127e780 -1 *** Caught signal (Aborted) **
in thread 7fd06127e780

ceph version 0.61.5 (8ee10dc4bb73bdd918873f29c70eedc3c7ef1979)
1: /usr/bin/ceph-mon() [0x59743a]
2: (()+0xfcb0) [0x7fd060919cb0]
3: (gsignal()+0x35) [0x7fd05f4ef425]
4: (abort()+0x17b) [0x7fd05f4f2b8b]
5: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7fd05fe4169d]
6: (()+0xb5846) [0x7fd05fe3f846]
7: (()+0xb5873) [0x7fd05fe3f873]
8: (()+0xb596e) [0x7fd05fe3f96e]
9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1df) [0x64f6ef]
10: /usr/bin/ceph-mon() [0x5073d6]
11: (PaxosService::refresh(bool*)+0x19b) [0x4edd4b]
12: (Monitor::refresh_from_paxos(bool*)+0x57) [0x48e5a7]
13: (Monitor::init_paxos()+0xf5) [0x48e755]
14: (Monitor::preinit()+0x6ac) [0x4a4e7c]
15: (main()+0x1c19) [0x483559]
16: (__libc_start_main()+0xed) [0x7fd05f4da76d]
17: /usr/bin/ceph-mon() [0x485e7d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- logging levels ---
0/ 5 none
0/ 1 lockdep
0/ 1 context
1/ 1 crush
1/ 5 mds
1/ 5 mds_balancer
1/ 5 mds_locker
1/ 5 mds_log
1/ 5 mds_log_expire
1/ 5 mds_migrator
0/ 1 buffer
0/ 1 timer
0/ 1 filer
0/ 1 striper
0/ 1 objecter
0/ 5 rados
0/ 5 rbd
0/ 5 journaler
0/ 5 objectcacher
0/ 5 client
0/ 5 osd
0/ 5 optracker
0/ 5 objclass
1/ 3 filestore
1/ 3 journal
0/ 5 ms
1/ 5 mon
0/10 monc
0/ 5 paxos
0/ 5 tp
1/ 5 auth
1/ 5 crypto
1/ 1 finisher
1/ 5 heartbeatmap
1/ 5 perfcounter
1/ 5 rgw
1/ 5 hadoop
1/ 5 javaclient
1/ 5 asok
1/ 1 throttle
-2/-2 (syslog threshold)
-1/-1 (stderr threshold)
max_recent 10000
max_new 1000
log_file /var/log/ceph/ceph-mon.narr9.log
--- end dump of recent events ---

Cheers,
Peter
_______________________________________________ ceph-users mailing list ceph-users@xxxxxxxxxxxxxx http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com