MDS crash

This is a trace of an MDS crash. I was running a simple setup (./vstart -d -n), and the trace is from out/mds.b.

This is from the latest wip-getdir branch. I've included some of the log context preceding the crash, and I have the full trace if more would be helpful.

-Noah

================================

2011-10-28 15:50:00.251876 7f2f3102b700 mds.1.cache.dir(100000003f6) pop_and_dirty_projected_fnode 0x13ab180 v55
2011-10-28 15:50:00.251902 7f2f3102b700 mds.1.cache.dir(100000003f6) mark_dirty (already dirty) [dir 100000003f6 /tmp/hadoop-nwatkins/mapred/staging/nwatkins/.staging/ [2,head] auth{0=1} pv=55 v=55 cv=0/0 ap=1+2+2 state=1610612738|complete f(v0 m2011-10-28 15:50:00.116185 3=0+3)->f(v0 m2011-10-28 15:50:00.116185 3=0+3) n(v5 rc2011-10-28 15:50:00.116185 b284930 5=2+3)->n(v5 rc2011-10-28 15:50:00.116185 b284930 5=2+3) hs=3+1,ss=0+0 dirty=4 | child replicated dirty authpin 0x12b6770] version 55
2011-10-28 15:50:00.251909 7f2f3102b700 mds.1.cache.dir(100000003f5) pop_and_dirty_projected_fnode 0x13abb40 v52
2011-10-28 15:50:00.251936 7f2f3102b700 mds.1.cache.dir(100000003f5) mark_dirty (already dirty) [dir 100000003f5 /tmp/hadoop-nwatkins/mapred/staging/nwatkins/ [2,head] auth{0=1} pv=52 v=52 cv=0/0 ap=1+1+2 state=1610612738|complete f(v0 m2011-10-28 15:39:07.835948 1=0+1)->f(v0 m2011-10-28 15:39:07.835948 1=0+1) n(v9 rc2011-10-28 15:50:00.116185 b284930 6=2+4)/n(v9 rc2011-10-28 15:46:30.070103 b284930 5=2+3)->n(v9 rc2011-10-28 15:50:00.116185 b284930 6=2+4)/n(v9 rc2011-10-28 15:46:30.070103 b284930 5=2+3) hs=1+0,ss=0+0 dirty=1 | child replicated dirty authpin 0x12b6378] version 52
2011-10-28 15:50:00.251957 7f2f3102b700 mds.1.cache send_dentry_link [dentry #1/tmp/hadoop-nwatkins/mapred/staging/nwatkins/.staging/job_201110281545_0003 [2,head] auth (dn xlock x=1 by 0x135bc00) (dversion lock w=1 last_client=4242) v=54 ap=2+0 inode=0x1311b60 | request lock inodepin dirty authpin 0x1345d80]
2011-10-28 15:50:00.251980 7f2f3102b700 mds.1.server reply_request 0 (Success) client_request(client.4242:11 mkdir #100000003f6/job_201110281545_0003) v1
2011-10-28 15:50:00.251990 7f2f3102b700 mds.1.server apply_allocated_inos 20000000004 / [20000000005~3e8] / 0
2011-10-28 15:50:00.252002 7f2f3102b700 mds.1.inotable: apply_alloc_id 20000000004 to [200000003ed~2fffffffc12]/[200000003ec~2fffffffc13]
./include/interval_set.h: In function 'void interval_set<T>::erase(T, T) [with T = inodeno_t]', in thread '7f2f3102b700'
./include/interval_set.h: 385: FAILED assert(p->first <= start)
 ceph version 0.37-192-g1a4eec2 (commit:1a4eec20a345ced993a48012aaaa8d8ca344a1ba)
 1: (InoTable::apply_alloc_id(inodeno_t)+0x441) [0x647041]
 2: (Server::apply_allocated_inos(MDRequest*)+0x4dd) [0x509f3d]
 3: (Server::reply_request(MDRequest*, MClientReply*, CInode*, CDentry*)+0x83) [0x50a283]
 4: (C_MDS_mknod_finish::finish(int)+0xfe) [0x53686e]
 5: (Context::complete(int)+0xa) [0x4a4d7a]
 6: (finish_contexts(CephContext*, std::list<Context*, std::allocator<Context*> >&, int)+0xc8) [0x4c3568]
 7: (Journaler::_finish_flush(int, unsigned long, utime_t)+0x18f) [0x69dd9f]
 8: (Objecter::handle_osd_op_reply(MOSDOpReply*)+0xc57) [0x686c47]
 9: (MDS::handle_core_message(Message*)+0x987) [0x4bedf7]
 10: (MDS::_dispatch(Message*)+0x2f) [0x4bef8f]
 11: (MDS::ms_dispatch(Message*)+0x70) [0x4c06f0]
 12: (SimpleMessenger::dispatch_entry()+0x833) [0x6edd13]
 13: (SimpleMessenger::DispatchThread::entry()+0x1c) [0x49ed7c]
 14: (()+0x7efc) [0x7f2f348f0efc]
 15: (clone()+0x6d) [0x7f2f3332a89d]
*** Caught signal (Aborted) **
 in thread 7f2f3102b700
 ceph version 0.37-192-g1a4eec2 (commit:1a4eec20a345ced993a48012aaaa8d8ca344a1ba)
 1: ./ceph-mds() [0x777fb6]
 2: (()+0x10060) [0x7f2f348f9060]
 3: (gsignal()+0x35) [0x7f2f3327f3a5]
 4: (abort()+0x17b) [0x7f2f33282b0b]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7f2f33b3dd7d]
 6: (()+0xb9f26) [0x7f2f33b3bf26]
 7: (()+0xb9f53) [0x7f2f33b3bf53]
 8: (()+0xba04e) [0x7f2f33b3c04e]
 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x193) [0x6fedf3]
 10: (InoTable::apply_alloc_id(inodeno_t)+0x441) [0x647041]
 11: (Server::apply_allocated_inos(MDRequest*)+0x4dd) [0x509f3d]
 12: (Server::reply_request(MDRequest*, MClientReply*, CInode*, CDentry*)+0x83) [0x50a283]
 13: (C_MDS_mknod_finish::finish(int)+0xfe) [0x53686e]
 14: (Context::complete(int)+0xa) [0x4a4d7a]
 15: (finish_contexts(CephContext*, std::list<Context*, std::allocator<Context*> >&, int)+0xc8) [0x4c3568]
 16: (Journaler::_finish_flush(int, unsigned long, utime_t)+0x18f) [0x69dd9f]
 17: (Objecter::handle_osd_op_reply(MOSDOpReply*)+0xc57) [0x686c47]
 18: (MDS::handle_core_message(Message*)+0x987) [0x4bedf7]
 19: (MDS::_dispatch(Message*)+0x2f) [0x4bef8f]
 20: (MDS::ms_dispatch(Message*)+0x70) [0x4c06f0]
 21: (SimpleMessenger::dispatch_entry()+0x833) [0x6edd13]
 22: (SimpleMessenger::DispatchThread::entry()+0x1c) [0x49ed7c]
 23: (()+0x7efc) [0x7f2f348f0efc]
 24: (clone()+0x6d) [0x7f2f3332a89d]
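
The values in the last log line before the assert can be checked by hand. The sketch below is a simplified model, not Ceph's interval_set: it assumes each `start~length` token is a hexadecimal start and length, and the helper names (`parse_interval`, `find_covering_interval`) are hypothetical. It suggests that the id passed to apply_alloc_id (20000000004) lies below the start of both printed sets, which would be consistent with erase() finding no interval that begins at or before it.

```python
import bisect


def parse_interval(text):
    """Parse one 'start~length' token (hex, as printed in the log)."""
    start, length = text.split("~")
    return int(start, 16), int(length, 16)


def find_covering_interval(intervals, ino):
    """Return the (start, length) interval containing ino, or None.

    intervals must be sorted by start and non-overlapping.
    """
    starts = [s for s, _ in intervals]
    # Index of the last interval whose start is <= ino, or -1 if none.
    i = bisect.bisect_right(starts, ino) - 1
    if i < 0:
        return None
    start, length = intervals[i]
    return (start, length) if ino < start + length else None


# Values copied from the apply_alloc_id log line; which set is which
# (free vs. projected) is not assumed here.
first_set = [parse_interval("200000003ed~2fffffffc12")]
second_set = [parse_interval("200000003ec~2fffffffc13")]
ino = int("20000000004", 16)

print(find_covering_interval(first_set, ino))   # None
print(find_covering_interval(second_set, ino))  # None
print(ino < first_set[0][0])                    # True
```

The same arithmetic shows that the preallocated range in the apply_allocated_inos line, 20000000005~3e8, ends exactly where the first set begins (0x20000000005 + 0x3e8 = 0x200000003ed), so the id in question sits just below a range that had already been taken out of the table.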