Primary OSD crashes during PG scrub (scheduled or manual)

Good evening everyone.
My Ceph build is cross-compiled and runs on an armv7l 32-bit development board. The Ceph version is 10.2.3 and the compiler version is 6.3.0.
After I placed an object in the RADOS cluster, I manually scrubbed the object's PG. At that point, the primary OSD crashed.
Here is the OSD log:

 ceph version  ()
 1: (()+0x7a7de8) [0x7fd1dde8]
 2: (__default_sa_restorer()+0) [0xb68db3c0]
 3: (()+0x24309c) [0x7f7b909c]
 4: (std::_Rb_tree_iterator<std::pair<hobject_t const, ScrubMap::object> > std::_Rb_tree<hobject_t, std::pair<hobject_t const, ScrubMap::object>, std::_Select1st<std::pair<hobject_t const, ScrubMap::object> >, hobject_t::BitwiseComparator, std::allocator<std::pair<hobject_t const, ScrubMap::object> > >::_M_emplace_hint_unique<std::piecewise_construct_t const&, std::tuple<hobject_t const&>, std::tuple<> >(std::_Rb_tree_const_iterator<std::pair<hobject_t const, ScrubMap::object> >, std::piecewise_construct_t const&, std::tuple<hobject_t const&>&&, std::tuple<>&&)+0x48) [0x7f87eed8]
 5: (ScrubMap::decode(ceph::buffer::list::iterator&, long long)+0x2b8) [0x7fa31498]
 6: (PG::sub_op_scrub_map(std::shared_ptr<OpRequest>)+0x1e8) [0x7f862db8]
 7: (ReplicatedPG::do_sub_op(std::shared_ptr<OpRequest>)+0x274) [0x7f8acb78]
 8: (ReplicatedPG::do_request(std::shared_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x518) [0x7f8d201c]
 9: (OSD::dequeue_op(boost::intrusive_ptr<PG>, std::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x3c4) [0x7f783e6c]
 10: (PGQueueable::RunVis::operator()(std::shared_ptr<OpRequest>&)+0x68) [0x7f78412c]
 11: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x5d4) [0x7f79c664]
 12: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x764) [0x7fe01da8]
 13: (()+0x88ea18) [0x7fe04a18]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

     0> 2018-03-16 11:26:39.186442 95fe5a30  2 -- 172.16.10.31:6800/6528 >> 172.16.10.35:6789/0 pipe(0x86236000 sd=23 :41154 s=2 pgs=174 cs=1 l=1 c=0x8631b7c0).reader got KEEPALIVE_ACK
--- logging levels ---
   0/ 5 none
   0/ 1 lockdep
   0/ 1 context
   1/ 1 crush
   1/ 5 mds
   1/ 5 mds_balancer
   1/ 5 mds_locker
   1/ 5 mds_log
   1/ 5 mds_log_expire
   1/ 5 mds_migrator
   0/ 1 buffer
   0/ 1 timer
   0/ 1 filer
   0/ 1 striper
   0/ 1 objecter
   0/ 5 rados
   0/ 5 rbd
   0/ 5 rbd_mirror
   0/ 5 rbd_replay
   0/ 5 journaler
   0/ 5 objectcacher
   0/ 5 client
   0/ 5 osd
   0/ 5 optracker
   0/ 5 objclass
   1/ 3 filestore
   1/ 3 journal
   0/ 5 ms
   1/ 5 mon
   0/10 monc
   1/ 5 paxos
   0/ 5 tp
   1/ 5 auth
   1/ 5 crypto
   1/ 1 finisher
   1/ 5 heartbeatmap
   1/ 5 perfcounter
   1/ 5 rgw
   1/10 civetweb
   1/ 5 javaclient
   1/ 5 asok
   1/ 1 throttle
   0/ 0 refs
   1/ 5 xio
   1/ 5 compressor
   1/ 5 newstore
   1/ 5 bluestore
   1/ 5 bluefs
   1/ 3 bdev
   1/ 5 kstore
   4/ 5 rocksdb
   4/ 5 leveldb
   1/ 5 kinetic
   1/ 5 fuse
  -2/-2 (syslog threshold)
  -1/-1 (stderr threshold)
  max_recent     10000
  max_new         1000
  log_file /var/log/ceph/ceph-osd.1.log


I also debugged with gdb. Here is the gdb output:

[Thread 0x95636a30 (LWP 7835) exited]

Thread 50 "tp_osd_tp" received signal SIGSEGV, Segmentation fault.
0x7f85c09c in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_Alloc_hider::_Alloc_hider (__a=..., __dat=<optimized out>, this=<optimized out>)
    at /usr/include/c++/6.3.0/bits/basic_string.h:110
110 /usr/include/c++/6.3.0/bits/basic_string.h: No such file or directory.
(gdb) where
#0  0x7f85c09c in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_Alloc_hider::_Alloc_hider (__a=..., __dat=<optimized out>, this=<optimized out>)
    at /usr/include/c++/6.3.0/bits/basic_string.h:110
#1  std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string (__str=..., this=<optimized out>) at /usr/include/c++/6.3.0/bits/basic_string.h:399
#2  object_t::object_t (this=<optimized out>) at /usr/src/debug/ceph-src/10.2.3-r0/git/src/include/object.h:32
#3  hobject_t::hobject_t (this=0x859a1850, rhs=...) at /usr/src/debug/ceph-src/10.2.3-r0/git/src/common/hobject.h:97
#4  0x7f921ed8 in std::pair<hobject_t const, ScrubMap::object>::pair<hobject_t const&, 0u>(std::tuple<hobject_t const&>&, std::tuple<>&, std::_Index_tuple<0u>, std::_Index_tuple<>) (
    __tuple2=<synthetic pointer>..., __tuple1=..., this=0x859a1850) at /usr/include/c++/6.3.0/tuple:1586
#5  std::pair<hobject_t const, ScrubMap::object>::pair<hobject_t const&>(std::piecewise_construct_t, std::tuple<hobject_t const&>, std::tuple<>) (__second=..., __first=...,
    this=0x859a1850) at /usr/include/c++/6.3.0/tuple:1575
#6  __gnu_cxx::new_allocator<std::_Rb_tree_node<std::pair<hobject_t const, ScrubMap::object> > >::construct<std::pair<hobject_t const, ScrubMap::object>, std::piecewise_construct_t const&, std::tuple<hobject_t const&>, std::tuple<> >(std::pair<hobject_t const, ScrubMap::object>*, std::piecewise_construct_t const&, std::tuple<hobject_t const&>&&, std::tuple<>&&) (
    this=<optimized out>, __p=0x859a1850) at /usr/include/c++/6.3.0/ext/new_allocator.h:120
#7  std::allocator_traits<std::allocator<std::_Rb_tree_node<std::pair<hobject_t const, ScrubMap::object> > > >::construct<std::pair<hobject_t const, ScrubMap::object>, std::piecewise_construct_t const&, std::tuple<hobject_t const&>, std::tuple<> >(std::allocator<std::_Rb_tree_node<std::pair<hobject_t const, ScrubMap::object> > >&, std::pair<hobject_t const, ScrubMap::object>*, std::piecewise_construct_t const&, std::tuple<hobject_t const&>&&, std::tuple<>&&) (__a=..., __p=<optimized out>) at /usr/include/c++/6.3.0/bits/alloc_traits.h:455
#8  std::_Rb_tree<hobject_t, std::pair<hobject_t const, ScrubMap::object>, std::_Select1st<std::pair<hobject_t const, ScrubMap::object> >, hobject_t::BitwiseComparator, std::allocator<std::pair<hobject_t const, ScrubMap::object> > >::_M_construct_node<std::piecewise_construct_t const&, std::tuple<hobject_t const&>, std::tuple<> >(std::_Rb_tree_node<std::pair<hobject_t const, ScrubMap::object> >*, std::piecewise_construct_t const&, std::tuple<hobject_t const&>&&, std::tuple<>&&) (this=0x85d4f808, __node=0x859a1840) at /usr/include/c++/6.3.0/bits/stl_tree.h:543
#9  std::_Rb_tree<hobject_t, std::pair<hobject_t const, ScrubMap::object>, std::_Select1st<std::pair<hobject_t const, ScrubMap::object> >, hobject_t::BitwiseComparator, std::allocator<std::pair<hobject_t const, ScrubMap::object> > >::_M_create_node<std::piecewise_construct_t const&, std::tuple<hobject_t const&>, std::tuple<> >(std::piecewise_construct_t const&, std::tuple<hobject_t const&>&&, std::tuple<>&&) (this=0x85d4f808) at /usr/include/c++/6.3.0/bits/stl_tree.h:560
#10 std::_Rb_tree<hobject_t, std::pair<hobject_t const, ScrubMap::object>, std::_Select1st<std::pair<hobject_t const, ScrubMap::object> >, hobject_t::BitwiseComparator, std::allocator<std::pair<hobject_t const, ScrubMap::object> > >::_M_emplace_hint_unique<std::piecewise_construct_t const&, std::tuple<hobject_t const&>, std::tuple<> >(std::_Rb_tree_const_iterator<std::pair<hobject_t const, ScrubMap::object> >, std::piecewise_construct_t const&, std::tuple<hobject_t const&>&&, std::tuple<>&&) (this=0x85d4f808, __pos=..., __args#0=...,
    __args#1=<unknown type in /usr/bin/ceph-osd, CU 0xdd4ce7, DIE 0x100304a>, __args#2=<unknown type in /usr/bin/ceph-osd, CU 0xdd4ce7, DIE 0x10579ff>)
    at /usr/include/c++/6.3.0/bits/stl_tree.h:2196
#11 0x7fad4498 in std::map<hobject_t, ScrubMap::object, hobject_t::BitwiseComparator, std::allocator<std::pair<hobject_t const, ScrubMap::object> > >::operator[] (__k=..., this=0x8616fb14)
    at /usr/include/c++/6.3.0/bits/stl_map.h:483
#12 decode<hobject_t, ScrubMap::object, hobject_t::BitwiseComparator> (p=..., m=...) at /usr/src/debug/ceph-src/10.2.3-r0/git/src/include/encoding.h:660
#13 ScrubMap::decode (this=0x8616fb14, bl=..., pool=-7137260651382875116) at /usr/src/debug/ceph-src/10.2.3-r0/git/src/osd/osd_types.cc:5282
#14 0x7f905db8 in PG::sub_op_scrub_map (this=this@entry=0x85d9c000, op=...) at /usr/src/debug/ceph-src/10.2.3-r0/git/src/osd/PG.cc:3485
#15 0x7f94fb78 in ReplicatedPG::do_sub_op (this=this@entry=0x85d9c000, op=...) at /usr/src/debug/ceph-src/10.2.3-r0/git/src/osd/ReplicatedPG.cc:3212
#16 0x7f97501c in ReplicatedPG::do_request (this=0x85d9c000, op=..., handle=...) at /usr/src/debug/ceph-src/10.2.3-r0/git/src/osd/ReplicatedPG.cc:1501
#17 0x7f826e6c in OSD::dequeue_op (this=this@entry=0x85b2e000, pg=..., op=..., handle=...) at /usr/src/debug/ceph-src/10.2.3-r0/git/src/osd/OSD.cc:8815
#18 0x7f82712c in PGQueueable::RunVis::operator() (this=this@entry=0x9cf35fb0, op=...) at /usr/src/debug/ceph-src/10.2.3-r0/git/src/osd/OSD.cc:163
#19 0x7f83f664 in boost::detail::variant::invoke_visitor<PGQueueable::RunVis>::internal_visit<std::shared_ptr<OpRequest> > (operand=..., this=<synthetic pointer>)
    at /usr/include/boost/variant/variant.hpp:1046
#20 boost::detail::variant::visitation_impl_invoke_impl<boost::detail::variant::invoke_visitor<PGQueueable::RunVis>, void*, std::shared_ptr<OpRequest> > (storage=0x9cf36134,
    visitor=<synthetic pointer>...) at /usr/include/boost/variant/detail/visitation_impl.hpp:114
#21 boost::detail::variant::visitation_impl_invoke<boost::detail::variant::invoke_visitor<PGQueueable::RunVis>, void*, std::shared_ptr<OpRequest>, boost::variant<std::shared_ptr<OpRequest>, PGSnapTrim, PGScrub>::has_fallback_type_> (internal_which=<optimized out>, t=0x0, storage=0x9cf36134, visitor=<synthetic pointer>...)
    at /usr/include/boost/variant/detail/visitation_impl.hpp:157
#22 boost::detail::variant::visitation_impl<mpl_::int_<0>, boost::detail::variant::visitation_impl_step<boost::mpl::l_iter<boost::mpl::l_item<mpl_::long_<3l>, std::shared_ptr<OpRequest>, boost::mpl::l_item<mpl_::long_<2l>, PGSnapTrim, boost::mpl::l_item<mpl_::long_<1l>, PGScrub, boost::mpl::l_end> > > >, boost::mpl::l_iter<boost::mpl::l_end> >, boost::detail::variant::invoke_visitor<PGQueueable::RunVis>, void*, boost::variant<std::shared_ptr<OpRequest>, PGSnapTrim, PGScrub>::has_fallback_type_> (no_backup_flag=..., storage=0x9cf36134,
    visitor=<synthetic pointer>..., logical_which=<optimized out>, internal_which=<optimized out>) at /usr/include/boost/variant/detail/visitation_impl.hpp:238
#23 boost::variant<std::shared_ptr<OpRequest>, PGSnapTrim, PGScrub>::internal_apply_visitor_impl<boost::detail::variant::invoke_visitor<PGQueueable::RunVis>, void*> (storage=0x9cf36134,
    visitor=<synthetic pointer>..., logical_which=<optimized out>, internal_which=<optimized out>) at /usr/include/boost/variant/variant.hpp:2389
#24 boost::variant<std::shared_ptr<OpRequest>, PGSnapTrim, PGScrub>::internal_apply_visitor<boost::detail::variant::invoke_visitor<PGQueueable::RunVis> > (visitor=<synthetic pointer>...,
    this=0x9cf36130) at /usr/include/boost/variant/variant.hpp:2400
#25 boost::variant<std::shared_ptr<OpRequest>, PGSnapTrim, PGScrub>::apply_visitor<PGQueueable::RunVis> (visitor=..., this=0x9cf36130) at /usr/include/boost/variant/variant.hpp:2423
#26 boost::apply_visitor<PGQueueable::RunVis, boost::variant<std::shared_ptr<OpRequest>, PGSnapTrim, PGScrub> > (visitable=..., visitor=...)
    at /usr/include/boost/variant/detail/apply_visitor_unary.hpp:70
#27 PGQueueable::run (handle=..., pg=..., osd=<optimized out>, this=0x9cf36130) at /usr/src/debug/ceph-src/10.2.3-r0/git/src/osd/OSD.h:392
#28 OSD::ShardedOpWQ::_process (this=0x85b2eebc, thread_index=<optimized out>, hb=<optimized out>) at /usr/src/debug/ceph-src/10.2.3-r0/git/src/osd/OSD.cc:8696
#29 0x7fea4da8 in ShardedThreadPool::shardedthreadpool_worker (this=0x85b2e5a8, thread_index=2146061736) at /usr/src/debug/ceph-src/10.2.3-r0/git/src/common/WorkQueue.cc:340
#30 0x7fea7a18 in ShardedThreadPool::WorkThreadSharded::entry (this=<optimized out>) at /usr/src/debug/ceph-src/10.2.3-r0/git/src/common/WorkQueue.h:684
#31 0xb6c28f38 in start_thread (arg=0x9cf36a30) at /usr/src/debug/glibc/2.25-r0/git/nptl/pthread_create.c:458
#32 0xb68d0298 in ?? () at ../sysdeps/unix/sysv/linux/arm/clone.S:76 from /lib/libc.so.6
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) c
Continuing.

Can someone help me find the bug?
Thank you very much.


 

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
