On Mon, 27 Feb 2017, Varada Kari wrote:
> If this problem is consistent, could you please collect a core dump? That
> might give some more clues, and you can periodically dump the rocksdb
> stats (rocksdb_collect_extended_stats and rocksdb_collect_memory_stats)
> from the admin socket. They might give some more info about the block
> table cache in rocksdb.

You might also try running ceph-osd through valgrind and see if that
gives you any clues.  I haven't seen this crash before.
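An untested sketch of what Varada and I are suggesting, assuming the OSD
is osd.0 and the admin socket is in the default location -- adjust for
your deployment:

  # enable the extra rocksdb counters, then dump them via the admin socket
  ceph daemon osd.0 config set rocksdb_collect_extended_stats true
  ceph daemon osd.0 config set rocksdb_collect_memory_stats true
  ceph daemon osd.0 perf dump

  # allow a core dump to be written on the next crash
  ulimit -c unlimited

  # run the osd in the foreground under valgrind (expect a big slowdown)
  valgrind --tool=memcheck --error-limit=no ceph-osd -f -i 0

With jemalloc in the picture, a use-after-free or double free anywhere in
the process could plausibly surface as a crash inside the rocksdb block
cache, which is why valgrind is worth the slowdown here.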
sage

> Varada
>
> On Sunday 26 February 2017 05:15 PM, Xiaoxi Chen wrote:
> > Hi Sage,
> >
> > We are hitting a repeatable segmentation fault with the jemalloc build
> > of Kraken 11.2.0 on Ubuntu 16.04.
> >
> > The log looks like the excerpt below; is this a bug, or a known issue?
> >
> > Xiaoxi.
> >
> >
> > 2017-02-20 16:29:51.178757 7f362de7aa40  4 rocksdb: Write Ahead Log file in db:
> > 2017-02-20 16:29:51.178758 7f362de7aa40  4 rocksdb:                Options.error_if_exists: 0
> >
> > [...]
> >
> >    -18> 2017-02-25 19:32:12.622925 7f8de53b8700 -1 bdev(/var/lib/ceph/osd/ceph-0/block) aio_submit retries 1
> >    -17> 2017-02-25 19:32:12.623455 7f8de53b8700 -1 bdev(/var/lib/ceph/osd/ceph-0/block) aio_submit retries 2
> >    -16> 2017-02-25 19:32:12.623645 7f8de53b8700 -1 bdev(/var/lib/ceph/osd/ceph-0/block) aio_submit retries 1
> >    -15> 2017-02-25 19:32:12.664082 7f8de13b0700 -1 bdev(/var/lib/ceph/osd/ceph-0/block) aio_submit retries 1
> >    -14> 2017-02-25 19:32:12.664274 7f8de13b0700 -1 bdev(/var/lib/ceph/osd/ceph-0/block) aio_submit retries 1
> >    -13> 2017-02-25 19:32:12.664461 7f8de13b0700 -1 bdev(/var/lib/ceph/osd/ceph-0/block) aio_submit retries 1
> >    -12> 2017-02-25 19:32:12.664645 7f8de13b0700 -1 bdev(/var/lib/ceph/osd/ceph-0/block) aio_submit retries 1
> >    -11> 2017-02-25 19:32:12.664829 7f8de13b0700 -1 bdev(/var/lib/ceph/osd/ceph-0/block) aio_submit retries 1
> >    -10> 2017-02-25 19:32:12.740853 7f8de03ae700 -1 bdev(/var/lib/ceph/osd/ceph-0/block) aio_submit retries 1
> >     -9> 2017-02-25 19:32:12.741046 7f8de03ae700 -1 bdev(/var/lib/ceph/osd/ceph-0/block) aio_submit retries 1
> >     -8> 2017-02-25 19:32:12.741238 7f8de03ae700 -1 bdev(/var/lib/ceph/osd/ceph-0/block) aio_submit retries 1
> >     -7> 2017-02-25 19:32:12.779708 7f8de6bbb700 -1 bdev(/var/lib/ceph/osd/ceph-0/block) aio_submit retries 1
> >     -6> 2017-02-25 19:32:12.779910 7f8de6bbb700 -1 bdev(/var/lib/ceph/osd/ceph-0/block) aio_submit retries 1
> >     -5> 2017-02-25 19:32:12.815901 7f8de03ae700 -1 bdev(/var/lib/ceph/osd/ceph-0/block) aio_submit retries 1
> >     -4> 2017-02-25 19:32:12.816090 7f8de03ae700 -1 bdev(/var/lib/ceph/osd/ceph-0/block) aio_submit retries 1
> >     -3> 2017-02-25 19:32:12.859204 7f8de43b6700 -1 bdev(/var/lib/ceph/osd/ceph-0/block) aio_submit retries 1
> >     -2> 2017-02-25 19:32:12.859401 7f8de43b6700 -1 bdev(/var/lib/ceph/osd/ceph-0/block) aio_submit retries 1
> >     -1> 2017-02-25 19:32:12.859588 7f8de43b6700 -1 bdev(/var/lib/ceph/osd/ceph-0/block) aio_submit retries 1
> >      0> 2017-02-25 19:32:12.898476 7f8de43b6700 -1 *** Caught signal (Segmentation fault) **
> >  in thread 7f8de43b6700 thread_name:tp_osd_tp
> >
> >  ceph version 11.2.0-41-g6c6b185 (6c6b185bab1e0b7d7446b97d5d314b4dd60360ff)
> >  1: (()+0x946e4e) [0x55a0f74c6e4e]
> >  2: (()+0x11390) [0x7f8e1c0d0390]
> >  3: (()+0x1f8af) [0x7f8e1ce458af]
> >  4: (rocksdb::BlockBasedTable::PutDataBlockToCache(rocksdb::Slice const&, rocksdb::Slice const&, rocksdb::Cache*, rocksdb::Cache*, rocksdb::ReadOptions const&, rocksdb::ImmutableCFOptions const&, rocksdb::BlockBasedTable::CachableEntry<rocksdb::Block>*, rocksdb::Block*, unsigned int, rocksdb::Slice const&, unsigned long)+0x1ce) [0x55a0f758380e]
> >  5: (rocksdb::BlockBasedTable::MaybeLoadDataBlockToCache(rocksdb::BlockBasedTable::Rep*, rocksdb::ReadOptions const&, rocksdb::BlockHandle const&, rocksdb::Slice, rocksdb::BlockBasedTable::CachableEntry<rocksdb::Block>*)+0x3ab) [0x55a0f7584feb]
> >  6: (rocksdb::BlockBasedTable::NewDataBlockIterator(rocksdb::BlockBasedTable::Rep*, rocksdb::ReadOptions const&, rocksdb::Slice const&, rocksdb::BlockIter*)+0x301) [0x55a0f75853b1]
> >  7: (rocksdb::BlockBasedTable::Get(rocksdb::ReadOptions const&, rocksdb::Slice const&, rocksdb::GetContext*, bool)+0x5cb) [0x55a0f758b17b]
> >  8: (rocksdb::TableCache::Get(rocksdb::ReadOptions const&, rocksdb::InternalKeyComparator const&, rocksdb::FileDescriptor const&, rocksdb::Slice const&, rocksdb::GetContext*, rocksdb::HistogramImpl*, bool, int)+0x2ee) [0x55a0f754817e]
> >  9: (rocksdb::Version::Get(rocksdb::ReadOptions const&, rocksdb::LookupKey const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, rocksdb::Status*, rocksdb::MergeContext*, rocksdb::RangeDelAggregator*, bool*, bool*, unsigned long*)+0x417) [0x55a0f755ed77]
> >  10: (rocksdb::DBImpl::GetImpl(rocksdb::ReadOptions const&, rocksdb::ColumnFamilyHandle*, rocksdb::Slice const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, bool*)+0x324) [0x55a0f74e68b4]
> >  11: (rocksdb::DBImpl::Get(rocksdb::ReadOptions const&, rocksdb::ColumnFamilyHandle*, rocksdb::Slice const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*)+0x22) [0x55a0f74e7002]
> >  12: (RocksDBStore::get(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, ceph::buffer::list*)+0x15d) [0x55a0f740c20d]
> >  13: (BlueStore::Collection::get_onode(ghobject_t const&, bool)+0x54e) [0x55a0f73b62be]
> >  14: (BlueStore::_txc_add_transaction(BlueStore::TransContext*, ObjectStore::Transaction*)+0x903) [0x55a0f73d4b23]
> >  15: (BlueStore::queue_transactions(ObjectStore::Sequencer*, std::vector<ObjectStore::Transaction, std::allocator<ObjectStore::Transaction> >&, std::shared_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x436) [0x55a0f73d6e66]
> >  16: (ObjectStore::queue_transaction(ObjectStore::Sequencer*, ObjectStore::Transaction&&, Context*, Context*, Context*, std::shared_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x1ab) [0x55a0f6fe0f1b]
> >  17: (PrimaryLogPG::queue_transaction(ObjectStore::Transaction&&, std::shared_ptr<OpRequest>)+0x6a) [0x55a0f716072a]
> >  18: (ReplicatedBackend::_do_push(std::shared_ptr<OpRequest>)+0x545) [0x55a0f7241a15]
> >  19: (ReplicatedBackend::handle_message(std::shared_ptr<OpRequest>)+0x320) [0x55a0f7250c60]
> >  20: (PrimaryLogPG::do_request(std::shared_ptr<OpRequest>&, ThreadPool::TPHandle&)+0xbd) [0x55a0f70f708d]
> >  21: (OSD::dequeue_op(boost::intrusive_ptr<PG>, std::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x418) [0x55a0f6f91ce8]
> >  22: (PGQueueable::RunVis::operator()(std::shared_ptr<OpRequest> const&)+0x52) [0x55a0f6f91f42]
> >  23: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x776) [0x55a0f6fb76e6]
> >  24: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x7f9) [0x55a0f769d129]
> >  25: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x55a0f76a04e0]
> >  26: (()+0x76ba) [0x7f8e1c0c66ba]
> >  27: (clone()+0x6d) [0x7f8e1a79582d]
> >  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
> >
> > --- logging levels ---
> >    0/ 5 none
> >    0/ 0 lockdep
> >    0/ 0 context
> >    0/ 0 crush
> >    0/ 0 mds
> >    0/ 0 mds_balancer
> >    0/ 0 mds_locker
> >    0/ 0 mds_log
> >    0/ 0 mds_log_expire
> >    0/ 0 mds_migrator
> >    0/ 0 buffer
> >    0/ 0 timer
> >    0/ 0 filer
> >    0/ 1 striper
> >    0/ 0 objecter
> >    0/ 0 rados
> >    0/ 0 rbd
> >    0/ 5 rbd_mirror
> >    0/ 5 rbd_replay
> >    0/ 0 journaler
> >    0/ 0 objectcacher
> >    0/ 0 client
> >    0/ 0 osd
> >    0/ 0 optracker
> >    0/ 0 objclass
> >    0/ 0 filestore
> >    0/ 0 journal
> >    0/ 0 ms
> >    0/ 0 mon
> >    0/ 0 monc
> >    0/ 0 paxos
> >    0/ 0 tp
> >    0/ 0 auth
> >    1/ 5 crypto
> >    0/ 0 finisher
> >    0/ 0 heartbeatmap
> >    0/ 0 perfcounter
> >    0/ 0 rgw
> >    1/10 civetweb
> >    1/ 5 javaclient
> >    0/ 0 asok
> >    0/ 0 throttle
> >    0/ 0 refs
> >    1/ 5 xio
> >    1/ 5 compressor
> >    1/ 5 newstore
> >    0/ 0 bluestore
> >    0/ 0 bluefs
> >    1/ 3 bdev
> >    1/ 5 kstore
> >    0/ 0 rocksdb
> >    4/ 5 leveldb
> >    4/ 5 memdb
> >    1/ 5 kinetic
> >    1/ 5 fuse
> >    1/ 5 mgr
> >    1/ 5 mgrc
> >    1/ 5 dpdk
> >   -2/-2 (syslog threshold)
> >   -1/-1 (stderr threshold)
> >   max_recent     10000
> >   max_new         1000
> >   log_file /var/log/ceph/ceph-osd.0.log
> > --- end dump of recent events ---
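To make the anonymous frames like `1: (()+0x946e4e)` readable, something
along these lines should work -- paths are hypothetical, and it has to be
the exact ceph-osd binary (with debug symbols) that produced the crash:

  # map an offset from the backtrace to a function name and source line
  addr2line -Cfe /usr/bin/ceph-osd 0x946e4e

  # or produce the full annotated disassembly the NOTE above refers to
  objdump -rdS /usr/bin/ceph-osd > ceph-osd.dis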
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html