Hi,

While trying to get an OSD back into the test cluster, which had dropped out for an unknown reason, we see a RocksDB segmentation fault during "compaction". I increased debugging to 20/20 for OSD / RocksDB; part of the logfile is below:

...
49477, 49476, 49475, 49474, 49473, 49472, 49471, 49470, 49469, 49468,
49467], "files_L1": [49465], "score": 1138.25, "input_data_size": 82872298}
    -1> 2018-01-12 08:48:23.915753 7f91eaf89e40  1 freelist init
     0> 2018-01-12 08:48:45.630418 7f91eaf89e40 -1 *** Caught signal (Segmentation fault) **
 in thread 7f91eaf89e40 thread_name:ceph-osd

 ceph version 12.2.2 (cf0baeeeeba3b47f9427c6c97e2144b094b7e5ba) luminous (stable)
 1: (()+0xa65824) [0x55a124693824]
 2: (()+0x11390) [0x7f91e9238390]
 3: (()+0x1f8af) [0x7f91eab658af]
 4: (rocksdb::BlockBasedTable::PutDataBlockToCache(rocksdb::Slice const&, rocksdb::Slice const&, rocksdb::Cache*, rocksdb::Cache*, rocksdb::ReadOptions const&, rocksdb::ImmutableCFOptions const&, rocksdb::BlockBasedTable::CachableEntry<rocksdb::Block>*, rocksdb::Block*, unsigned int, rocksdb::Slice const&, unsigned long, bool, rocksdb::Cache::Priority)+0x1d9) [0x55a124a64e49]
 5: (rocksdb::BlockBasedTable::MaybeLoadDataBlockToCache(rocksdb::BlockBasedTable::Rep*, rocksdb::ReadOptions const&, rocksdb::BlockHandle const&, rocksdb::Slice, rocksdb::BlockBasedTable::CachableEntry<rocksdb::Block>*, bool)+0x3b7) [0x55a124a66827]
 6: (rocksdb::BlockBasedTable::NewDataBlockIterator(rocksdb::BlockBasedTable::Rep*, rocksdb::ReadOptions const&, rocksdb::BlockHandle const&, rocksdb::BlockIter*, bool, rocksdb::Status)+0x2ac) [0x55a124a66b6c]
 7: (rocksdb::BlockBasedTable::BlockEntryIteratorState::NewSecondaryIterator(rocksdb::Slice const&)+0x97) [0x55a124a6f2e7]
 8: (()+0xe6c48e) [0x55a124a9a48e]
 9: (()+0xe6ca06) [0x55a124a9aa06]
 10: (rocksdb::MergingIterator::Seek(rocksdb::Slice const&)+0x126) [0x55a124a7bc86]
 11: (rocksdb::DBIter::Seek(rocksdb::Slice const&)+0x20a) [0x55a124b1bdaa]
 12: (RocksDBStore::RocksDBWholeSpaceIteratorImpl::lower_bound(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x46) [0x55a1245d4676]
 13: (BitmapFreelistManager::init(unsigned long)+0x2dc) [0x55a12463976c]
 14: (BlueStore::_open_fm(bool)+0xc00) [0x55a124526c50]
 15: (BlueStore::_mount(bool)+0x3dc) [0x55a12459aa1c]
 16: (OSD::init()+0x3e2) [0x55a1241064e2]
 17: (main()+0x2f07) [0x55a1240181d7]
 18: (__libc_start_main()+0xf0) [0x7f91e81be830]
 19: (_start()+0x29) [0x55a1240a37f9]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

The disk in question is very old (powered on for ~8 years), so it might be that part of the data is corrupt. Would RocksDB throw an error like this in that case?

Gr. Stefan

P.s. We're trying to learn as much as possible when things do not go according to plan. There is way more debug info available in case anyone is interested.

--
| BIT BV  http://www.bit.nl/        Kamer van Koophandel 09090351
| GPG: 0xD14839C6                   +31 318 648 688 / info@xxxxxx
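P.p.s. For completeness, this is roughly what we plan to run next to find out whether the old disk (or the data on it) is actually at fault. It's only a sketch: /dev/sdX, the OSD id NN and the path /var/lib/ceph/osd/ceph-NN are placeholders for our device and OSD, and the OSD has to be stopped first (it crashes on start anyway):

  # check SMART health / error counters of the device backing the OSD
  smartctl -a /dev/sdX

  # stop the OSD (adjust to however your OSDs are started) and run an
  # offline consistency check of the BlueStore data
  systemctl stop ceph-osd@NN
  ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-NN

Happy to hear if there are better ways to check this.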