Hi, thanks for your help. I checked, and my ceph and ceph-debuginfo packages are the same version. Is there any other possible cause? Thank you :-)

At 2016-11-20 15:40:29, "huang jun" <hjwsm1989@xxxxxxxxx> wrote:
>For the first question, you can reinstall the ceph-debuginfo package released with your ceph package.
>For the assert problem, you can create an issue to track it:
>http://tracker.ceph.com/projects/ceph/issues
>
>2016-11-20 15:29 GMT+08:00 xxhdx1985126 <xxhdx1985126@xxxxxxx>:
>>
>> No, how do I verify it? And do you have any clue what made that assert fail? Thank you
>>
>> At 2016-11-20 15:28:26, "huang jun" <hjwsm1989@xxxxxxxxx> wrote:
>>>Seems like the ceph and ceph-debuginfo package versions do not match. Did you verify that?
>>>
>>>2016-11-20 15:20 GMT+08:00 xxhdx1985126 <xxhdx1985126@xxxxxxx>:
>>>> In my test today, the same problem came up even when there was no such warning.
>>>>
>>>> By the way, the problem in ceph that I want to fix is this: some of my OSDs can't finish the recovery+backfilling process because the following assert fails:
>>>>
>>>> 2016-11-19 07:00:49.133814 7fc7a77ff700 -1 error_msg osd/ReplicatedPG.cc: In function 'void ReplicatedPG::wait_for_unreadable_object(const hobject_t&, OpRequestRef)' thread 7fc7a77ff700 time 2016-11-19 07:00:48.914231
>>>> osd/ReplicatedPG.cc: 387: FAILED assert(needs_recovery)
>>>>
>>>> ceph version 0.94.5-12-g83f56a1 (83f56a1c84e3dbd95a4c394335a7b1dc926dd1c4)
>>>> 1: (ReplicatedPG::wait_for_unreadable_object(hobject_t const&, std::tr1::shared_ptr<OpRequest>)+0x3f5) [0x8b5a65]
>>>> 2: (ReplicatedPG::do_op(std::tr1::shared_ptr<OpRequest>&)+0x5e9) [0x8f0c79]
>>>> 3: (ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x4e3) [0x87fdc3]
>>>> 4: (OSD::dequeue_op(boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x178) [0x66b3f8]
>>>> 5: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x59e) [0x66f8ee]
>>>> 6: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x795) [0xa76d85]
>>>> 7: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0xa7a610]
>>>> 8: /lib64/libpthread.so.0() [0x393da07a51]
>>>> 9: (clone()+0x6d) [0x393d6e893d]
>>>> NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
>>>>
>>>> I'm using ceph-0.94.5, which should be the "Hammer" release.
>>>> Do you have any clue about what made this assert fail?
>>>>
>>>> At 2016-11-20 09:51:47, "huang jun" <hjwsm1989@xxxxxxxxx> wrote:
>>>>>That may be the reason. Do you have the same problem if there is no such warning?
>>>>>
>>>>>2016-11-19 19:00 GMT+08:00 xxhdx1985126 <xxhdx1985126@xxxxxxx>:
>>>>>>
>>>>>> Hi, everyone.
>>>>>>
>>>>>> I'm trying to fix a problem in ceph using its core file and gdb.
>>>>>> gdb successfully loaded the debug symbols from ceph-debuginfo:
>>>>>>
>>>>>> Reading symbols from /usr/bin/ceph-osd...Reading symbols from /usr/lib/debug/usr/bin/ceph-osd.debug...done.
>>>>>>
>>>>>> However, it still can't find the symbol table when I use "bt" to trace the stack:
>>>>>>
>>>>>> #0 0x000000393da0f65b in ?? ()
>>>>>> No symbol table info available.
>>>>>> #1 0x0000000000a51636 in install_standard_sighandlers () at global/signal_handler.cc:121
>>>>>> No locals.
>>>>>> #2 0x00007fc7a77f9ed0 in ?? ()
>>>>>> No symbol table info available.
>>>>>> #3 0x00007fc7a77f9e10 in ?? ()
>>>>>> No symbol table info available.
>>>>>> #4 0x00007fc7a77f9b90 in ?? ()
>>>>>> No symbol table info available.
>>>>>> #5 0x00007fc66d3142e0 in ?? ()
>>>>>> No symbol table info available.
>>>>>> #6 0x00007fc7fac64100 in ?? ()
>>>>>> No symbol table info available.
>>>>>> #7 0x0000003900000000 in ?? ()
>>>>>> No symbol table info available.
>>>>>> #8 0x0000000000a51155 in SignalHandler::unregister_handler (this=0x1105440, signum=<value optimized out>, handler=<value optimized out>) at global/signal_handler.cc:317
>>>>>> No locals.
>>>>>> #9 0x000000393eabcc33 in ?? ()
>>>>>> No symbol table info available.
>>>>>> #10 0x000000393eabcd2e in ?? ()
>>>>>> No symbol table info available.
>>>>>>
>>>>>> Why is this happening?
>>>>>>
>>>>>> PS: when gdb started running, it printed the following warning:
>>>>>>
>>>>>> BFD: Warning: /home/xuxuehan/online_problems.2016-11-19.7-01/core-ceph-osd-6-32337-32337-19906-1479510049 is truncated: expected core file size >= 8372899840, found: 7439335424
>>>>>>
>>>>>> Could this be the cause of gdb not finding the symbol table?
>>>>>
>>>>>--
>>>>>Thank you!
>>>>>HuangJun
>>>
>>>--
>>>Thank you!
>>>HuangJun
>>
>
>--
>Thank you!
>HuangJun
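
On the truncation warning quoted in the first message: a core file shorter than expected may be missing memory segments, possibly including thread stacks, and that can leave gdb unable to unwind frames even when the debug symbols match. As a rough sketch, and not something the posters ran, the Python snippet below parses a BFD warning of that form and reports how many bytes are missing. The function name missing_core_bytes and the regular expression are hypothetical, written only for this illustration, and the core file path is shortened to its file name.

```python
import re

# Hypothetical pattern matching the BFD truncation warning quoted in the thread.
WARNING_RE = re.compile(
    r"is truncated: expected core file size >= (\d+), found: (\d+)"
)


def missing_core_bytes(warning: str) -> int:
    """Return how many bytes the core file is short, according to the warning."""
    match = WARNING_RE.search(warning)
    if match is None:
        raise ValueError("not a BFD truncation warning")
    expected, found = (int(group) for group in match.groups())
    return expected - found


# Warning text from the thread, with the directory part of the path omitted.
warning = (
    "BFD: Warning: core-ceph-osd-6-32337-32337-19906-1479510049 is truncated: "
    "expected core file size >= 8372899840, found: 7439335424"
)
print(missing_core_bytes(warning))  # 933564416
```

For the warning in this thread the shortfall is 933564416 bytes. Truncation like this is often caused by a core size limit or by running out of disk space while the dump is written, though the thread does not establish whether either applied here.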