Re: 3 OSDs stopped and unable to restart

Has anybody else run into this? It seems to be slowly spreading to other OSDs; maybe the backfill process hits a bad PG and kills off another OSD (just a guess, since the failures are hours apart). It's kind of a pain because I have to continually rebuild these OSDs before the cluster runs out of space.
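To keep an eye on capacity while the rebuilds run, the standard CLI checks would be (nothing cluster-specific assumed here):
- ceph osd df
  ## Per-OSD utilization; watch the %USE column
- ceph -s
  ## Overall health, plus recovery/backfill progress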

On Wed, Jul 3, 2019 at 2:59 PM Brett Chancellor <bchancellor@xxxxxxxxxxxxxx> wrote:
Hi All! Today 3 of my OSDs have stopped themselves and are unable to restart, all with the same error. These OSDs are all on different hosts, and all are running 14.2.1.

I tried the following two commands:
- ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-80 list > keys
  ## This failed with the same error below
- ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-80 fsck
  ## After a couple of hours, returned...
2019-07-03 18:30:02.095 7fe7c1c1ef00 -1 bluestore(/var/lib/ceph/osd/ceph-80) fsck warning: legacy statfs record found, suggest to run store repair to get consistent statistic reports
fsck success
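
The warning above points at a store repair; assuming the same OSD path as the fsck run, that would look like the following (it should clean up the legacy statfs records, though it may not address the crash itself):
- ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-80 repair
  ## Run with the OSD stopped; converts the legacy statfs records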


## Error when trying to start one of the OSDs
   -12> 2019-07-03 18:36:57.450 7f5e42366700 -1 *** Caught signal (Aborted) **
 in thread 7f5e42366700 thread_name:rocksdb:low0

 ceph version 14.2.1 (d555a9489eb35f84f2e1ef49b77e19da9d113972) nautilus (stable)
 1: (()+0xf5d0) [0x7f5e50bd75d0]
 2: (gsignal()+0x37) [0x7f5e4f9ce207]
 3: (abort()+0x148) [0x7f5e4f9cf8f8]
 4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x199) [0x55a7aaee96ab]
 5: (ceph::__ceph_assertf_fail(char const*, char const*, int, char const*, char const*, ...)+0) [0x55a7aaee982a]
 6: (interval_set<unsigned long, std::map<unsigned long, unsigned long, std::less<unsigned long>, std::allocator<std::pair<unsigned long const, unsigned long> > > >::insert(unsigned long, unsigned long, unsigned long*, unsigned long*)+0x3c6) [0x55a7ab212a66]
 7: (BlueStore::allocate_bluefs_freespace(unsigned long, unsigned long, std::vector<bluestore_pextent_t, mempool::pool_allocator<(mempool::pool_index_t)4, bluestore_pextent_t> >*)+0x74e) [0x55a7ab48253e]
 8: (BlueFS::_expand_slow_device(unsigned long, std::vector<bluestore_pextent_t, mempool::pool_allocator<(mempool::pool_index_t)4, bluestore_pextent_t> >&)+0x111) [0x55a7ab59e921]
 9: (BlueFS::_allocate(unsigned char, unsigned long, bluefs_fnode_t*)+0x68b) [0x55a7ab59f68b]
 10: (BlueFS::_flush_range(BlueFS::FileWriter*, unsigned long, unsigned long)+0xe5) [0x55a7ab59fce5]
 11: (BlueFS::_flush(BlueFS::FileWriter*, bool)+0x10b) [0x55a7ab5a1b4b]
 12: (BlueRocksWritableFile::Flush()+0x3d) [0x55a7ab5bf84d]
 13: (rocksdb::WritableFileWriter::Flush()+0x19e) [0x55a7abbedd0e]
 14: (rocksdb::WritableFileWriter::Sync(bool)+0x2e) [0x55a7abbedfee]
 15: (rocksdb::CompactionJob::FinishCompactionOutputFile(rocksdb::Status const&, rocksdb::CompactionJob::SubcompactionState*, rocksdb::RangeDelAggregator*, CompactionIterationStats*, rocksdb::Slice const*)+0xbaa) [0x55a7abc3b73a]
 16: (rocksdb::CompactionJob::ProcessKeyValueCompaction(rocksdb::CompactionJob::SubcompactionState*)+0x7d0) [0x55a7abc3f150]
 17: (rocksdb::CompactionJob::Run()+0x298) [0x55a7abc40618]
 18: (rocksdb::DBImpl::BackgroundCompaction(bool*, rocksdb::JobContext*, rocksdb::LogBuffer*, rocksdb::DBImpl::PrepickedCompaction*)+0xcb7) [0x55a7aba7fb67]
 19: (rocksdb::DBImpl::BackgroundCallCompaction(rocksdb::DBImpl::PrepickedCompaction*, rocksdb::Env::Priority)+0xd0) [0x55a7aba813c0]
 20: (rocksdb::DBImpl::BGWorkCompaction(void*)+0x3a) [0x55a7aba8190a]
 21: (rocksdb::ThreadPoolImpl::Impl::BGThread(unsigned long)+0x264) [0x55a7abc8d9c4]
 22: (rocksdb::ThreadPoolImpl::Impl::BGThreadWrapper(void*)+0x4f) [0x55a7abc8db4f]
 23: (()+0x129dfff) [0x55a7abd1afff]
 24: (()+0x7dd5) [0x7f5e50bcfdd5]
 25: (clone()+0x6d) [0x7f5e4fa95ead]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
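
Reading the trace bottom-up: a RocksDB background compaction flushes through BlueFS, BlueFS runs out of space and tries to expand onto the slow device (BlueFS::_expand_slow_device), and BlueStore::allocate_bluefs_freespace then hits an assert in interval_set::insert. With the OSD stopped, checking how much space BlueFS holds on each device may help confirm this; bluefs-bdev-sizes is a standard ceph-bluestore-tool subcommand, and the OSD path is assumed:
- ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-80 bluefs-bdev-sizes
  ## Prints each BlueFS device's size and the portion BlueFS owns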

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
