Re: Bluestore/Rocksdb panic when compiling with jemalloc


 



On Mon, 27 Feb 2017, Varada Kari wrote:
> If this problem is consistent, could you please collect the core? That
> might give some more clues, and you can also periodically dump the
> rocksdb stats (rocksdb_collect_extended_stats and
> rocksdb_collect_memory_stats) from the admin socket. They might give
> some more info about the block table cache from rocksdb.
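
[For reference, pulling those stats from the admin socket might look like
the following; the osd id (osd.0) and the default socket path are
assumptions, adjust for your deployment:]

```shell
# Enable the rocksdb stat collection on the running OSD (osd.0 assumed),
# then dump the perf counters, which include the rocksdb block-cache
# numbers once these options are on.
ceph daemon osd.0 config set rocksdb_collect_extended_stats true
ceph daemon osd.0 config set rocksdb_collect_memory_stats true
ceph daemon osd.0 perf dump

# Equivalently, talk to the admin socket directly if the cluster keyring
# is not available on this host:
ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok perf dump
```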

You might also try running ceph-osd through valgrind and see if that gives 
you any clues.  I haven't seen this crash before.
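
[Running it under valgrind might look like the sketch below; the osd id and
log path are assumptions. Note also that memcheck substitutes its own
allocator, so a jemalloc-specific crash may not reproduce under it:]

```shell
# Stop the managed daemon first so the OSD is not running twice
# (osd.0 assumed).
systemctl stop ceph-osd@0

# Run the OSD in the foreground under memcheck. This is very slow, so
# expect heartbeat warnings; errors go to the named log file.
valgrind --tool=memcheck --error-limit=no \
    --log-file=/tmp/ceph-osd.0.valgrind.log \
    ceph-osd -f --cluster ceph --id 0
```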

sage

> 
> Varada
> 
> On Sunday 26 February 2017 05:15 PM, Xiaoxi Chen wrote:
> > Hi Sage,
> >
> >      We got a repeatable segmentation fault with a jemalloc build of
> > Kraken 11.2.0 on Ubuntu 16.04.
> >
> >      The log looks like the excerpt below; wondering if it is a bug or a known issue?
> >
> >
> > Xiaoxi.
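
[When chasing an allocator-specific crash, one quick sanity check is
confirming which allocator the ceph-osd binary actually links against;
a sketch, assuming the packaged path /usr/bin/ceph-osd:]

```shell
# See whether this build dynamically links jemalloc or tcmalloc
# (no output means neither library is dynamically linked).
ldd /usr/bin/ceph-osd | grep -E 'jemalloc|tcmalloc'
```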
> >
> >
> > 2017-02-20 16:29:51.178757 7f362de7aa40  4 rocksdb: Write Ahead Log file in db:
> >
> > 2017-02-20 16:29:51.178758 7f362de7aa40  4 rocksdb:
> >      Options.error_if_exists: 0
> >
> >    0/ 0 client
> >    0/ 0 osd
> >    0/ 0 optracker
> >    0/ 0 objclass
> >    0/ 0 filestore
> >    0/ 0 journal
> >    0/ 0 ms
> >    0/ 0 mon
> >    0/ 0 monc
> >    0/ 0 paxos
> >    0/ 0 tp
> >    0/ 0 auth
> >    1/ 5 crypto
> >    0/ 0 finisher
> >    0/ 0 heartbeatmap
> >    0/ 0 perfcounter
> >    0/ 0 rgw
> >    1/10 civetweb
> >    1/ 5 javaclient
> >    0/ 0 asok
> >    0/ 0 throttle
> >    0/ 0 refs
> >    1/ 5 xio
> >    1/ 5 compressor
> >    1/ 5 newstore
> >    0/ 0 bluestore
> >    0/ 0 bluefs
> >    1/ 3 bdev
> >    1/ 5 kstore
> >    0/ 0 rocksdb
> >    4/ 5 leveldb
> >    4/ 5 memdb
> >    1/ 5 kinetic
> >    1/ 5 fuse
> >    1/ 5 mgr
> >    1/ 5 mgrc
> >    1/ 5 dpdk
> >   -2/-2 (syslog threshold)
> >   -1/-1 (stderr threshold)
> >   max_recent     10000
> >   max_new         1000
> >   log_file /var/log/ceph/ceph-osd.0.log
> > --- end dump of recent events ---
> >
> >
> >
> >
> >
> >    -18> 2017-02-25 19:32:12.622925 7f8de53b8700 -1
> > bdev(/var/lib/ceph/osd/ceph-0/block) aio_submit retries 1
> >    -17> 2017-02-25 19:32:12.623455 7f8de53b8700 -1
> > bdev(/var/lib/ceph/osd/ceph-0/block) aio_submit retries 2
> >    -16> 2017-02-25 19:32:12.623645 7f8de53b8700 -1
> > bdev(/var/lib/ceph/osd/ceph-0/block) aio_submit retries 1
> >    -15> 2017-02-25 19:32:12.664082 7f8de13b0700 -1
> > bdev(/var/lib/ceph/osd/ceph-0/block) aio_submit retries 1
> >    -14> 2017-02-25 19:32:12.664274 7f8de13b0700 -1
> > bdev(/var/lib/ceph/osd/ceph-0/block) aio_submit retries 1
> >    -13> 2017-02-25 19:32:12.664461 7f8de13b0700 -1
> > bdev(/var/lib/ceph/osd/ceph-0/block) aio_submit retries 1
> >    -12> 2017-02-25 19:32:12.664645 7f8de13b0700 -1
> > bdev(/var/lib/ceph/osd/ceph-0/block) aio_submit retries 1
> >    -11> 2017-02-25 19:32:12.664829 7f8de13b0700 -1
> > bdev(/var/lib/ceph/osd/ceph-0/block) aio_submit retries 1
> >    -10> 2017-02-25 19:32:12.740853 7f8de03ae700 -1
> > bdev(/var/lib/ceph/osd/ceph-0/block) aio_submit retries 1
> >     -9> 2017-02-25 19:32:12.741046 7f8de03ae700 -1
> > bdev(/var/lib/ceph/osd/ceph-0/block) aio_submit retries 1
> >     -8> 2017-02-25 19:32:12.741238 7f8de03ae700 -1
> > bdev(/var/lib/ceph/osd/ceph-0/block) aio_submit retries 1
> >     -7> 2017-02-25 19:32:12.779708 7f8de6bbb700 -1
> > bdev(/var/lib/ceph/osd/ceph-0/block) aio_submit retries 1
> >     -6> 2017-02-25 19:32:12.779910 7f8de6bbb700 -1
> > bdev(/var/lib/ceph/osd/ceph-0/block) aio_submit retries 1
> >     -5> 2017-02-25 19:32:12.815901 7f8de03ae700 -1
> > bdev(/var/lib/ceph/osd/ceph-0/block) aio_submit retries 1
> >     -4> 2017-02-25 19:32:12.816090 7f8de03ae700 -1
> > bdev(/var/lib/ceph/osd/ceph-0/block) aio_submit retries 1
> >     -3> 2017-02-25 19:32:12.859204 7f8de43b6700 -1
> > bdev(/var/lib/ceph/osd/ceph-0/block) aio_submit retries 1
> >     -2> 2017-02-25 19:32:12.859401 7f8de43b6700 -1
> > bdev(/var/lib/ceph/osd/ceph-0/block) aio_submit retries 1
> >     -1> 2017-02-25 19:32:12.859588 7f8de43b6700 -1
> > bdev(/var/lib/ceph/osd/ceph-0/block) aio_submit retries 1
> >      0> 2017-02-25 19:32:12.898476 7f8de43b6700 -1 *** Caught signal
> > (Segmentation fault) **
> >  in thread 7f8de43b6700 thread_name:tp_osd_tp
> >
> >  ceph version 11.2.0-41-g6c6b185 (6c6b185bab1e0b7d7446b97d5d314b4dd60360ff)
> >  1: (()+0x946e4e) [0x55a0f74c6e4e]
> >  2: (()+0x11390) [0x7f8e1c0d0390]
> >  3: (()+0x1f8af) [0x7f8e1ce458af]
> >  4: (rocksdb::BlockBasedTable::PutDataBlockToCache(rocksdb::Slice
> > const&, rocksdb::Slice const&, rocksdb::Cache*, rocksdb::Cache*,
> > rocksdb::ReadOptions const&, rocksdb::ImmutableCFOptions const&,
> > rocksdb::BlockBasedTable::CachableEntry<rocksdb::Block>*,
> > rocksdb::Block*, unsigned int, rocksdb::Slice const&, unsigned
> > long)+0x1ce) [0x55a0f758380e]
> >  5: (rocksdb::BlockBasedTable::MaybeLoadDataBlockToCache(rocksdb::BlockBasedTable::Rep*,
> > rocksdb::ReadOptions const&, rocksdb::BlockHandle const&,
> > rocksdb::Slice,
> > rocksdb::BlockBasedTable::CachableEntry<rocksdb::Block>*)+0x3ab)
> > [0x55a0f7584feb]
> >  6: (rocksdb::BlockBasedTable::NewDataBlockIterator(rocksdb::BlockBasedTable::Rep*,
> > rocksdb::ReadOptions const&, rocksdb::Slice const&,
> > rocksdb::BlockIter*)+0x301) [0x55a0f75853b1]
> >  7: (rocksdb::BlockBasedTable::Get(rocksdb::ReadOptions const&,
> > rocksdb::Slice const&, rocksdb::GetContext*, bool)+0x5cb)
> > [0x55a0f758b17b]
> >  8: (rocksdb::TableCache::Get(rocksdb::ReadOptions const&,
> > rocksdb::InternalKeyComparator const&, rocksdb::FileDescriptor const&,
> > rocksdb::Slice const&, rocksdb::GetContext*, rocksdb::HistogramImpl*,
> > bool, int)+0x2ee) [0x55a0f754817e]
> >  9: (rocksdb::Version::Get(rocksdb::ReadOptions const&,
> > rocksdb::LookupKey const&, std::__cxx11::basic_string<char,
> > std::char_traits<char>, std::allocator<char> >*, rocksdb::Status*,
> > rocksdb::MergeContext*, rocksdb::RangeDelAggregator*, bool*, bool*,
> > unsigned long*)+0x417) [0x55a0f755ed77]
> >  10: (rocksdb::DBImpl::GetImpl(rocksdb::ReadOptions const&,
> > rocksdb::ColumnFamilyHandle*, rocksdb::Slice const&,
> > std::__cxx11::basic_string<char, std::char_traits<char>,
> > std::allocator<char> >*, bool*)+0x324) [0x55a0f74e68b4]
> >  11: (rocksdb::DBImpl::Get(rocksdb::ReadOptions const&,
> > rocksdb::ColumnFamilyHandle*, rocksdb::Slice const&,
> > std::__cxx11::basic_string<char, std::char_traits<char>,
> > std::allocator<char> >*)+0x22) [0x55a0f74e7002]
> >  12: (RocksDBStore::get(std::__cxx11::basic_string<char,
> > std::char_traits<char>, std::allocator<char> > const&,
> > std::__cxx11::basic_string<char, std::char_traits<char>,
> > std::allocator<char> > const&, ceph::buffer::list*)+0x15d)
> > [0x55a0f740c20d]
> >  13: (BlueStore::Collection::get_onode(ghobject_t const&, bool)+0x54e)
> > [0x55a0f73b62be]
> >  14: (BlueStore::_txc_add_transaction(BlueStore::TransContext*,
> > ObjectStore::Transaction*)+0x903) [0x55a0f73d4b23]
> >  15: (BlueStore::queue_transactions(ObjectStore::Sequencer*,
> > std::vector<ObjectStore::Transaction,
> > std::allocator<ObjectStore::Transaction> >&,
> > std::shared_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x436)
> > [0x55a0f73d6e66]
> >  16: (ObjectStore::queue_transaction(ObjectStore::Sequencer*,
> > ObjectStore::Transaction&&, Context*, Context*, Context*,
> > std::shared_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x1ab)
> > [0x55a0f6fe0f1b]
> >  17: (PrimaryLogPG::queue_transaction(ObjectStore::Transaction&&,
> > std::shared_ptr<OpRequest>)+0x6a) [0x55a0f716072a]
> >  18: (ReplicatedBackend::_do_push(std::shared_ptr<OpRequest>)+0x545)
> > [0x55a0f7241a15]
> >  19: (ReplicatedBackend::handle_message(std::shared_ptr<OpRequest>)+0x320)
> > [0x55a0f7250c60]
> >  20: (PrimaryLogPG::do_request(std::shared_ptr<OpRequest>&,
> > ThreadPool::TPHandle&)+0xbd) [0x55a0f70f708d]
> >  21: (OSD::dequeue_op(boost::intrusive_ptr<PG>,
> > std::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x418)
> > [0x55a0f6f91ce8]
> >  22: (PGQueueable::RunVis::operator()(std::shared_ptr<OpRequest>
> > const&)+0x52) [0x55a0f6f91f42]
> >  23: (OSD::ShardedOpWQ::_process(unsigned int,
> > ceph::heartbeat_handle_d*)+0x776) [0x55a0f6fb76e6]
> >  24: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x7f9)
> > [0x55a0f769d129]
> >  25: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x55a0f76a04e0]
> >  26: (()+0x76ba) [0x7f8e1c0c66ba]
> >  27: (clone()+0x6d) [0x7f8e1a79582d]
> >  NOTE: a copy of the executable, or `objdump -rdS <executable>` is
> > needed to interpret this.
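
[As the NOTE says, decoding the raw addresses needs the binary or its
debug symbols. A sketch, assuming the installed /usr/bin/ceph-osd matches
the crashing build: frames printed as (()+0xOFFSET) are offsets into the
executable itself, so they can be fed to addr2line directly:]

```shell
# Frame 1 above is (()+0x946e4e), an offset into ceph-osd; -C demangles,
# -f prints the function name, -i expands inlined frames.
addr2line -Cfie /usr/bin/ceph-osd 0x946e4e

# Or produce the annotated disassembly the NOTE asks for (needs the
# matching debug-symbol package installed).
objdump -rdS /usr/bin/ceph-osd > ceph-osd.dump
```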
> >
> > --- logging levels ---
> >    0/ 5 none
> >    0/ 0 lockdep
> >    0/ 0 context
> >    0/ 0 crush
> >    0/ 0 mds
> >    0/ 0 mds_balancer
> >    0/ 0 mds_locker
> >    0/ 0 mds_log
> >    0/ 0 mds_log_expire
> >    0/ 0 mds_migrator
> >    0/ 0 buffer
> >    0/ 0 timer
> >    0/ 0 filer
> >    0/ 1 striper
> >    0/ 0 objecter
> >    0/ 0 rados
> >    0/ 0 rbd
> >    0/ 5 rbd_mirror
> >    0/ 5 rbd_replay
> >    0/ 0 journaler
> >    0/ 0 objectcacher
> >    0/ 0 client
> >    0/ 0 osd
> >    0/ 0 optracker
> >
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> > the body of a message to majordomo@xxxxxxxxxxxxxxx
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >
> 
> 
> 


