Re: 3 OSDs stopped and unable to restart

I should have read the call stack more carefully. This is not about a lack of free space; it is the bug from this ticket:

http://tracker.ceph.com/issues/40080


You should upgrade to v14.2.2 (once it's available) or temporarily switch to the stupid allocator as a workaround.
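
For reference, a minimal sketch of the allocator workaround as a ceph.conf fragment. It assumes the Nautilus option names bluestore_allocator and bluefs_allocator; whether both need to be changed for this bug is an assumption here, so please check the ticket before applying it. The affected OSDs have to be restarted to pick up the change, and the settings can be dropped again after the upgrade.

```ini
[osd]
# Assumption: move both allocators off the default bitmap allocator
bluestore_allocator = stupid
bluefs_allocator = stupid
```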


Thanks,

Igor



On 7/8/2019 8:00 PM, Igor Fedotov wrote:

Hi Brett,

It looks like BlueStore is unable to allocate additional space for BlueFS on the main device. Either the device lacks free space or the free space is too fragmented.

Would you share the OSD log, please?

Also, please run "ceph-bluestore-tool --path <path-to-osd> bluefs-bdev-sizes" (substitute the path to your OSD) and share the output.

Thanks,

Igor

On 7/3/2019 9:59 PM, Brett Chancellor wrote:
Hi All! Today 3 of my OSDs stopped themselves and are unable to restart, all with the same error. These OSDs are all on different hosts, and all are running 14.2.1.

I did try the following two commands:
- ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-80 list > keys
  ## This failed with the same error below
- ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-80 fsck
 ## After a couple of hours returned...
2019-07-03 18:30:02.095 7fe7c1c1ef00 -1 bluestore(/var/lib/ceph/osd/ceph-80) fsck warning: legacy statfs record found, suggest to run store repair to get consistent statistic reports
fsck success


## Error when trying to start one of the OSDs
   -12> 2019-07-03 18:36:57.450 7f5e42366700 -1 *** Caught signal (Aborted) **
 in thread 7f5e42366700 thread_name:rocksdb:low0

 ceph version 14.2.1 (d555a9489eb35f84f2e1ef49b77e19da9d113972) nautilus (stable)
 1: (()+0xf5d0) [0x7f5e50bd75d0]
 2: (gsignal()+0x37) [0x7f5e4f9ce207]
 3: (abort()+0x148) [0x7f5e4f9cf8f8]
 4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x199) [0x55a7aaee96ab]
 5: (ceph::__ceph_assertf_fail(char const*, char const*, int, char const*, char const*, ...)+0) [0x55a7aaee982a]
 6: (interval_set<unsigned long, std::map<unsigned long, unsigned long, std::less<unsigned long>, std::allocator<std::pair<unsigned long const, unsigned long> > > >::insert(unsigned long, unsigned long, unsigned long*, unsigned long*)+0x3c6) [0x55a7ab212a66]
 7: (BlueStore::allocate_bluefs_freespace(unsigned long, unsigned long, std::vector<bluestore_pextent_t, mempool::pool_allocator<(mempool::pool_index_t)4, bluestore_pextent_t> >*)+0x74e) [0x55a7ab48253e]
 8: (BlueFS::_expand_slow_device(unsigned long, std::vector<bluestore_pextent_t, mempool::pool_allocator<(mempool::pool_index_t)4, bluestore_pextent_t> >&)+0x111) [0x55a7ab59e921]
 9: (BlueFS::_allocate(unsigned char, unsigned long, bluefs_fnode_t*)+0x68b) [0x55a7ab59f68b]
 10: (BlueFS::_flush_range(BlueFS::FileWriter*, unsigned long, unsigned long)+0xe5) [0x55a7ab59fce5]
 11: (BlueFS::_flush(BlueFS::FileWriter*, bool)+0x10b) [0x55a7ab5a1b4b]
 12: (BlueRocksWritableFile::Flush()+0x3d) [0x55a7ab5bf84d]
 13: (rocksdb::WritableFileWriter::Flush()+0x19e) [0x55a7abbedd0e]
 14: (rocksdb::WritableFileWriter::Sync(bool)+0x2e) [0x55a7abbedfee]
 15: (rocksdb::CompactionJob::FinishCompactionOutputFile(rocksdb::Status const&, rocksdb::CompactionJob::SubcompactionState*, rocksdb::RangeDelAggregator*, CompactionIterationStats*, rocksdb::Slice const*)+0xbaa) [0x55a7abc3b73a]
 16: (rocksdb::CompactionJob::ProcessKeyValueCompaction(rocksdb::CompactionJob::SubcompactionState*)+0x7d0) [0x55a7abc3f150]
 17: (rocksdb::CompactionJob::Run()+0x298) [0x55a7abc40618]
 18: (rocksdb::DBImpl::BackgroundCompaction(bool*, rocksdb::JobContext*, rocksdb::LogBuffer*, rocksdb::DBImpl::PrepickedCompaction*)+0xcb7) [0x55a7aba7fb67]
 19: (rocksdb::DBImpl::BackgroundCallCompaction(rocksdb::DBImpl::PrepickedCompaction*, rocksdb::Env::Priority)+0xd0) [0x55a7aba813c0]
 20: (rocksdb::DBImpl::BGWorkCompaction(void*)+0x3a) [0x55a7aba8190a]
 21: (rocksdb::ThreadPoolImpl::Impl::BGThread(unsigned long)+0x264) [0x55a7abc8d9c4]
 22: (rocksdb::ThreadPoolImpl::Impl::BGThreadWrapper(void*)+0x4f) [0x55a7abc8db4f]
 23: (()+0x129dfff) [0x55a7abd1afff]
 24: (()+0x7dd5) [0x7f5e50bcfdd5]
 25: (clone()+0x6d) [0x7f5e4fa95ead]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.


_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
