On Thu, Apr 18, 2013 at 2:46 PM, Joao Eduardo Luis
<joao.luis@xxxxxxxxxxx> wrote:
> On 04/18/2013 10:36 PM, Gregory Farnum wrote:
>>
>> (I believe your monitor crash is something else, Matthew; if that
>> hasn't been dealt with yet. Unfortunately all that log has is
>> messages, so it probably needs a bit more. Can you check it out, Joao?
>
>
> The stack trace below is #3495, and Matthew is already testing the fix (as
> per the tracker, so far so good, but we should know more in the next day
> or so).
>
>
>> It appears to be a follower which ends up in propose_pending, which is
>> distinctly odd...)
>
>
> I might be missing something, but what gave you that impression? That would
> certainly be odd (to say the least!)

I could have just missed some message traffic (or misread what's
there), but there is a point where I think it's forwarding a command to
the leader, and the crash is in propose_pending. I like your answers
better. ;)
-Greg

>
> -Joao
>
>
>> Thanks for the bug report!
>> -Greg
>> Software Engineer #42 @ http://inktank.com | http://ceph.com
>>
>>
>> On Mon, Apr 8, 2013 at 7:39 AM, Mike Dawson <mdawson@xxxxxxxxxxxxx> wrote:
>>>
>>> Matthew,
>>>
>>> I have seen the same behavior on 0.59. Ran through some troubleshooting
>>> with Dan and Joao on March 21st and 22nd, but I haven't looked at it
>>> since then.
>>>
>>> If you look at running processes, I believe you'll see an instance of
>>> ceph-create-keys start each time you start a Monitor. So, if you restart
>>> the monitor several times, you'll have several ceph-create-keys processes
>>> piling up, essentially leaking processes. IIRC, the tmp files you see in
>>> /etc/ceph correspond with the ceph-create-keys PID. Can you confirm
>>> that's what you are seeing?
>>>
>>> I haven't looked in a couple weeks, but I hope to start 0.60 later today.
>>>
>>> - Mike
>>>
>>>
>>> On 4/8/2013 12:43 AM, Matthew Roy wrote:
>>>>
>>>> I'm seeing weird messages in my monitor logs that don't correlate to
>>>> admin activity:
>>>>
>>>> 2013-04-07 22:54:11.528871 7f2e9e6c8700 1 --
>>>> [2001:<something>::20]:6789/0 --> [2001:<something>::20]:0/1920 --
>>>> mon_command_ack([auth,get-or-create,client.admin,mon,allow *,osd,allow
>>>> *,mds,allow]=-13 access denied v134192) v1 -- ?+0 0x37bfc00 con
>>>> 0x3716840
>>>>
>>>> It's also writing out a bunch of empty files along the lines of
>>>> "ceph.client.admin.keyring.1008.tmp" in /etc/ceph/. Could this be related
>>>> to the mon trying to "Starting ceph-create-keys" when starting?
>>>>
>>>> This could be the cause of, or just associated with, some general
>>>> instability of the monitor cluster. After increasing the logging level I
>>>> did catch one crash:
>>>>
>>>> ceph version 0.60 (f26f7a39021dbf440c28d6375222e21c94fe8e5c)
>>>> 1: /usr/bin/ceph-mon() [0x5834fa]
>>>> 2: (()+0xfcb0) [0x7f4b03328cb0]
>>>> 3: (gsignal()+0x35) [0x7f4b01efe425]
>>>> 4: (abort()+0x17b) [0x7f4b01f01b8b]
>>>> 5: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7f4b0285069d]
>>>> 6: (()+0xb5846) [0x7f4b0284e846]
>>>> 7: (()+0xb5873) [0x7f4b0284e873]
>>>> 8: (()+0xb596e) [0x7f4b0284e96e]
>>>> 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1df) [0x636c8f]
>>>> 10: (PaxosService::propose_pending()+0x46d) [0x4dee3d]
>>>> 11: (MDSMonitor::tick()+0x1c62) [0x51cdd2]
>>>> 12: (MDSMonitor::on_active()+0x1a) [0x512ada]
>>>> 13: (PaxosService::_active()+0x31d) [0x4e067d]
>>>> 14: (Context::complete(int)+0xa) [0x4b7b4a]
>>>> 15: (finish_contexts(CephContext*, std::list<Context*, std::allocator<Context*> >&, int)+0x95) [0x4ba5a5]
>>>> 16: (Paxos::handle_last(MMonPaxos*)+0xbef) [0x4da92f]
>>>> 17: (Paxos::dispatch(PaxosServiceMessage*)+0x26b) [0x4dad8b]
>>>> 18: (Monitor::_ms_dispatch(Message*)+0x149f) [0x4b310f]
>>>> 19: (Monitor::ms_dispatch(Message*)+0x32) [0x4c9d12]
>>>> 20: (DispatchQueue::entry()+0x341) [0x698da1]
>>>> 21: (DispatchQueue::DispatchThread::entry()+0xd) [0x626c5d]
>>>> 22: (()+0x7e9a) [0x7f4b03320e9a]
>>>> 23: (clone()+0x6d) [0x7f4b01fbbcbd]
>>>>
>>>> The complete log is at: http://goo.gl/UmNs3
>>>>
>>>>
>>>> Does anyone recognize what's going on?
>>>>
>>> _______________________________________________
>>> ceph-users mailing list
>>> ceph-users@xxxxxxxxxxxxxx
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
>
> --
> Joao Eduardo Luis
> Software Engineer | http://inktank.com | http://ceph.com
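
For anyone wanting to check Mike's theory (that each tmp file in
/etc/ceph carries the PID of a ceph-create-keys process), here is a
minimal Python sketch. It is not part of Ceph. The helper names
(pids_from_tmp_files, correlate, scan_host) are hypothetical, the
filename pattern is inferred from the single example in this thread
("ceph.client.admin.keyring.1008.tmp"), and the idea that the numeric
field is a PID is Mike's recollection, not something confirmed here.

```python
import os
import re
import subprocess

# Pattern inferred from the one filename quoted in the thread;
# the numeric field is ASSUMED to be the ceph-create-keys PID.
TMP_RE = re.compile(r"^ceph\.client\.admin\.keyring\.(\d+)\.tmp$")


def pids_from_tmp_files(filenames):
    """Return {pid: filename} for names matching the tmp-file pattern."""
    found = {}
    for name in filenames:
        m = TMP_RE.match(name)
        if m:
            found[int(m.group(1))] = name
    return found


def correlate(filenames, running_pids):
    """Split tmp files into (live, stale) by whether their PID is running."""
    by_pid = pids_from_tmp_files(filenames)
    live = {pid: f for pid, f in by_pid.items() if pid in running_pids}
    stale = {pid: f for pid, f in by_pid.items() if pid not in running_pids}
    return live, stale


def scan_host(conf_dir="/etc/ceph"):
    """Gather real data from this host: directory listing plus pgrep output."""
    try:
        names = os.listdir(conf_dir)
    except FileNotFoundError:
        names = []
    # pgrep -f matches against the full command line; it exits non-zero
    # when nothing matches, so the return code is deliberately not checked.
    out = subprocess.run(
        ["pgrep", "-f", "ceph-create-keys"],
        capture_output=True, text=True,
    ).stdout
    pids = {int(tok) for tok in out.split() if tok.isdigit()}
    return correlate(names, pids)
```

Calling scan_host() on a monitor node returns two dicts: tmp files whose
PID matches a running ceph-create-keys process, and tmp files whose PID
does not. If the "live" dict grows by one entry per monitor restart, that
would be consistent with the process leak described above.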