Hi,

Do you have SCSI errors around the time of the crash? Run `journalctl -k` and look for SCSI medium errors.

Cheers, Dan

On Mon, Aug 17, 2020 at 3:50 PM EDH - Manuel Rios <mriosfer@xxxxxxxxxxxxxxxx> wrote:
>
> Hi,
>
> Today one of our SSDs dedicated to the RGW index crashed; it may be a bug, or the OSD may simply have crashed.
>
> Our current version is 14.2.11, and today we are under a heavy object-processing load, approx. 60 TB of data.
>
> ceph version 14.2.11 (f7fdb2f52131f54b891a2ec99d8205561242cdaf) nautilus (stable)
> 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x14a) [0x563f96b550e5]
> 2: (()+0x4d72ad) [0x563f96b552ad]
> 3: (BlueFS::_read(BlueFS::FileReader*, BlueFS::FileReaderBuffer*, unsigned long, unsigned long, ceph::buffer::v14_2_0::list*, char*)+0xf0e) [0x563f9715aa9e]
> 4: (BlueRocksRandomAccessFile::Prefetch(unsigned long, unsigned long)+0x2a) [0x563f9718453a]
> 5: (rocksdb::BlockBasedTableIterator<rocksdb::DataBlockIter, rocksdb::Slice>::InitDataBlock()+0x29f) [0x563f9772697f]
> 6: (rocksdb::BlockBasedTableIterator<rocksdb::DataBlockIter, rocksdb::Slice>::FindKeyForward()+0x1c0) [0x563f97726bb0]
> 7: (()+0x102fd29) [0x563f976add29]
> 8: (rocksdb::MergingIterator::Next()+0x42) [0x563f97738162]
> 9: (rocksdb::DBIter::Next()+0x1f3) [0x563f97641e53]
> 10: (RocksDBStore::RocksDBWholeSpaceIteratorImpl::next()+0x2d) [0x563f975b36bd]
> 11: (RocksDBStore::RocksDBTransactionImpl::rm_range_keys(std::string const&, std::string const&, std::string const&)+0x567) [0x563f975beab7]
> 12: (BlueStore::_do_omap_clear(BlueStore::TransContext*, std::string const&, unsigned long)+0x72) [0x563f9708f2f2]
> 13: (BlueStore::_do_remove(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>)+0xc16) [0x563f970a6026]
> 14: (BlueStore::_remove(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>&)+0x5f) [0x563f970a6cbf]
> 15: (BlueStore::_txc_add_transaction(BlueStore::TransContext*, ObjectStore::Transaction*)+0x13f5) [0x563f970acca5]
> 16: (BlueStore::queue_transactions(boost::intrusive_ptr<ObjectStore::CollectionImpl>&, std::vector<ObjectStore::Transaction, std::allocator<ObjectStore::Transaction> >&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x370) [0x563f970c1100]
> 17: (ObjectStore::queue_transaction(boost::intrusive_ptr<ObjectStore::CollectionImpl>&, ObjectStore::Transaction&&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x7f) [0x563f96cb6d3f]
> 18: (non-virtual thunk to PrimaryLogPG::queue_transaction(ObjectStore::Transaction&&, boost::intrusive_ptr<OpRequest>)+0x4f) [0x563f96e3015f]
> 19: (ReplicatedBackend::_do_push(boost::intrusive_ptr<OpRequest>)+0x4a0) [0x563f96f2a970]
> 20: (ReplicatedBackend::_handle_message(boost::intrusive_ptr<OpRequest>)+0x298) [0x563f96f32d38]
> 21: (PGBackend::handle_message(boost::intrusive_ptr<OpRequest>)+0x4a) [0x563f96e4486a]
> 22: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x5b3) [0x563f96df4c63]
> 23: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x362) [0x563f96c34da2]
> 24: (PGOpItem::run(OSD*, OSDShard*, boost::intrusive_ptr<PG>&, ThreadPool::TPHandle&)+0x62) [0x563f96ec37c2]
> 25: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x90f) [0x563f96c4fd3f]
> 26: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x5b6) [0x563f97203c46]
> 27: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x563f97206760]
> 28: (()+0x7dd5) [0x7f1e504eddd5]
> 29: (clone()+0x6d) [0x7f1e4f3ad02d]
>
> 0> 2020-08-17 15:45:27.609 7f1e2fa82700 -1 *** Caught signal (Aborted) **
> in thread 7f1e2fa82700 thread_name:tp_osd_tp
>
> ceph version 14.2.11 (f7fdb2f52131f54b891a2ec99d8205561242cdaf) nautilus (stable)
> 1: (()+0xf5d0) [0x7f1e504f55d0]
> 2: (gsignal()+0x37) [0x7f1e4f2e52c7]
> 3: (abort()+0x148) [0x7f1e4f2e69b8]
> 4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x199) [0x563f96b55134]
> 5: (()+0x4d72ad) [0x563f96b552ad]
> 6: (BlueFS::_read(BlueFS::FileReader*, BlueFS::FileReaderBuffer*, unsigned long, unsigned long, ceph::buffer::v14_2_0::list*, char*)+0xf0e) [0x563f9715aa9e]
> 7: (BlueRocksRandomAccessFile::Prefetch(unsigned long, unsigned long)+0x2a) [0x563f9718453a]
> 8: (rocksdb::BlockBasedTableIterator<rocksdb::DataBlockIter, rocksdb::Slice>::InitDataBlock()+0x29f) [0x563f9772697f]
> 9: (rocksdb::BlockBasedTableIterator<rocksdb::DataBlockIter, rocksdb::Slice>::FindKeyForward()+0x1c0) [0x563f97726bb0]
> 10: (()+0x102fd29) [0x563f976add29]
> 11: (rocksdb::MergingIterator::Next()+0x42) [0x563f97738162]
> 12: (rocksdb::DBIter::Next()+0x1f3) [0x563f97641e53]
> 13: (RocksDBStore::RocksDBWholeSpaceIteratorImpl::next()+0x2d) [0x563f975b36bd]
> 14: (RocksDBStore::RocksDBTransactionImpl::rm_range_keys(std::string const&, std::string const&, std::string const&)+0x567) [0x563f975beab7]
> 15: (BlueStore::_do_omap_clear(BlueStore::TransContext*, std::string const&, unsigned long)+0x72) [0x563f9708f2f2]
> 16: (BlueStore::_do_remove(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>)+0xc16) [0x563f970a6026]
> 17: (BlueStore::_remove(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>&)+0x5f) [0x563f970a6cbf]
> 18: (BlueStore::_txc_add_transaction(BlueStore::TransContext*, ObjectStore::Transaction*)+0x13f5) [0x563f970acca5]
> 19: (BlueStore::queue_transactions(boost::intrusive_ptr<ObjectStore::CollectionImpl>&, std::vector<ObjectStore::Transaction, std::allocator<ObjectStore::Transaction> >&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x370) [0x563f970c1100]
> 20: (ObjectStore::queue_transaction(boost::intrusive_ptr<ObjectStore::CollectionImpl>&, ObjectStore::Transaction&&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x7f) [0x563f96cb6d3f]
> 21: (non-virtual thunk to PrimaryLogPG::queue_transaction(ObjectStore::Transaction&&, boost::intrusive_ptr<OpRequest>)+0x4f) [0x563f96e3015f]
> 22: (ReplicatedBackend::_do_push(boost::intrusive_ptr<OpRequest>)+0x4a0) [0x563f96f2a970]
> 23: (ReplicatedBackend::_handle_message(boost::intrusive_ptr<OpRequest>)+0x298) [0x563f96f32d38]
> 24: (PGBackend::handle_message(boost::intrusive_ptr<OpRequest>)+0x4a) [0x563f96e4486a]
> 25: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x5b3) [0x563f96df4c63]
> 26: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x362) [0x563f96c34da2]
> 27: (PGOpItem::run(OSD*, OSDShard*, boost::intrusive_ptr<PG>&, ThreadPool::TPHandle&)+0x62) [0x563f96ec37c2]
> 28: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x90f) [0x563f96c4fd3f]
> 29: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x5b6) [0x563f97203c46]
> 30: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x563f97206760]
> 31: (()+0x7dd5) [0x7f1e504eddd5]
> 32: (clone()+0x6d) [0x7f1e4f3ad02d]
> NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
>
> Any ideas, or has anyone seen a similar situation?
>
>
> Manuel Ríos Fernández
> CEO - Business development
> 677677179 · mriosfer@xxxxxxxxxxxxxxxx
> Do not print this message unless necessary. Let's protect the environment.
> This message and any attached files are confidential, especially with regard to personal data, and are intended exclusively for the named recipient.
> If you are not the recipient and have received it in error or have become aware of it for any reason, please notify us by this means and destroy or delete it, and in any case refrain from using, reproducing, altering, archiving, or communicating to third parties this message and its attachments, under penalty of incurring legal liability.
> The sender does not guarantee the integrity, speed, or security of this email, and accepts no responsibility for possible damages arising from interception, the introduction of viruses, or any other manipulation carried out by third parties.
> This e-mail message and all attachments transmitted with it may contain legally privileged, proprietary and/or confidential information intended solely for the use of the addressee. If you are not the intended recipient, you are hereby notified that any review, dissemination, distribution, duplication or other use of this message and/or its attachments is strictly prohibited. If you are not the intended recipient, please contact the sender by reply e-mail and destroy all copies of the original message and its attachments. Thank you.
>
>
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
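
For anyone following up on the `journalctl -k` suggestion at the top of this thread, here is a minimal sketch of how that check might be narrowed to the time of the crash. The function names `filter_medium_errors` and `check_window` are hypothetical helpers, not part of any Ceph or systemd tooling, and the case-insensitive `medium error` pattern is an assumption about how the kernel words these messages on your system; adjust it if your logs differ.

```shell
#!/bin/sh
# Hypothetical helper: read kernel log lines on stdin and keep only those
# that mention a medium error (case-insensitive). The pattern is an
# assumption; widen it if your kernel phrases the message differently.
filter_medium_errors() {
    # grep exits non-zero when nothing matches; treat "no matches" as success.
    grep -iE 'medium error' || true
}

# Hypothetical helper: query the kernel journal between two timestamps
# (format "YYYY-MM-DD HH:MM:SS") and filter it for medium errors.
check_window() {
    journalctl -k --since "$1" --until "$2" | filter_medium_errors
}
```

Run on the OSD host, a call such as `check_window "2020-08-17 15:30:00" "2020-08-17 16:00:00"` would bracket the 15:45:27 abort shown in the trace. Empty output only means nothing matched the pattern in that window; it does not by itself rule out a failing device.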