RocksDB segmentation fault during compaction (on OSD)

Hi,

While trying to bring an OSD back into the test cluster, after it had
dropped out for an unknown reason, we see a RocksDB segmentation fault
during "compaction". I increased debugging to 20/20 for OSD / RocksDB
(settings sketched below); part of the logfile follows after that:
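For reference, this is roughly what I set, in ceph.conf syntax (injecting
the same values into the running daemon with "ceph tell osd.N injectargs"
should work as well; N is the OSD id):

    [osd]
    debug osd = 20/20
    debug rocksdb = 20/20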

... 49477, 49476, 49475, 49474, 49473, 49472, 49471, 49470, 49469, 49468,
49467], "files_L1": [49465], "score": 1138.25, "input_data_size": 82872298}
    -1> 2018-01-12 08:48:23.915753 7f91eaf89e40  1 freelist init
     0> 2018-01-12 08:48:45.630418 7f91eaf89e40 -1 *** Caught signal (Segmentation fault) **
 in thread 7f91eaf89e40 thread_name:ceph-osd

 ceph version 12.2.2 (cf0baeeeeba3b47f9427c6c97e2144b094b7e5ba) luminous (stable)
 1: (()+0xa65824) [0x55a124693824]
 2: (()+0x11390) [0x7f91e9238390]
 3: (()+0x1f8af) [0x7f91eab658af]
 4: (rocksdb::BlockBasedTable::PutDataBlockToCache(rocksdb::Slice const&, rocksdb::Slice const&, rocksdb::Cache*, rocksdb::Cache*, rocksdb::ReadOptions const&, rocksdb::ImmutableCFOptions const&, rocksdb::BlockBasedTable::CachableEntry<rocksdb::Block>*, rocksdb::Block*, unsigned int, rocksdb::Slice const&, unsigned long, bool, rocksdb::Cache::Priority)+0x1d9) [0x55a124a64e49]
 5: (rocksdb::BlockBasedTable::MaybeLoadDataBlockToCache(rocksdb::BlockBasedTable::Rep*, rocksdb::ReadOptions const&, rocksdb::BlockHandle const&, rocksdb::Slice, rocksdb::BlockBasedTable::CachableEntry<rocksdb::Block>*, bool)+0x3b7) [0x55a124a66827]
 6: (rocksdb::BlockBasedTable::NewDataBlockIterator(rocksdb::BlockBasedTable::Rep*, rocksdb::ReadOptions const&, rocksdb::BlockHandle const&, rocksdb::BlockIter*, bool, rocksdb::Status)+0x2ac) [0x55a124a66b6c]
 7: (rocksdb::BlockBasedTable::BlockEntryIteratorState::NewSecondaryIterator(rocksdb::Slice const&)+0x97) [0x55a124a6f2e7]
 8: (()+0xe6c48e) [0x55a124a9a48e]
 9: (()+0xe6ca06) [0x55a124a9aa06]
 10: (rocksdb::MergingIterator::Seek(rocksdb::Slice const&)+0x126) [0x55a124a7bc86]
 11: (rocksdb::DBIter::Seek(rocksdb::Slice const&)+0x20a) [0x55a124b1bdaa]
 12: (RocksDBStore::RocksDBWholeSpaceIteratorImpl::lower_bound(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x46) [0x55a1245d4676]
 13: (BitmapFreelistManager::init(unsigned long)+0x2dc) [0x55a12463976c]
 14: (BlueStore::_open_fm(bool)+0xc00) [0x55a124526c50]
 15: (BlueStore::_mount(bool)+0x3dc) [0x55a12459aa1c]
 16: (OSD::init()+0x3e2) [0x55a1241064e2]
 17: (main()+0x2f07) [0x55a1240181d7]
 18: (__libc_start_main()+0xf0) [0x7f91e81be830]
 19: (_start()+0x29) [0x55a1240a37f9]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
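For anyone reading along: to make sense of the anonymous frames such as
"(()+0xa65824)", the offset can be fed to addr2line against the installed
binary, provided matching debug symbols are installed (package names differ
per distro; a sketch, not verified against this exact build):

    objdump -rdS /usr/bin/ceph-osd > ceph-osd.asm   # annotated disassembly, as the NOTE suggests
    addr2line -Cfe /usr/bin/ceph-osd 0xa65824       # resolve a single frame offset to function/file:line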

The disk in question is very old (powered on for ~8 years), so it might be
that part of the data is corrupt. Would RocksDB throw an error like this in
that case?
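In the meantime we'll take a look at the drive itself, along these lines
(device name is just an example):

    smartctl -a /dev/sdX    # reallocated / pending / uncorrectable sector counts
    dmesg | grep -i sdX     # kernel-level I/O errors around the time of the crash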

Gr. Stefan

P.S. We're trying to learn as much as possible when things do not go
according to plan. There is much more debug info available in case anyone
is interested.



-- 
| BIT BV  http://www.bit.nl/        Kamer van Koophandel 09090351
| GPG: 0xD14839C6                   +31 318 648 688 / info@xxxxxx


