Ceph version 17.2.6

After a power loss event affecting my Ceph cluster, I've been putting Humpty Dumpty back together ever since. One problem I face is that while objects are degraded, rebalancing doesn't run, and this resulted in several of my fast OSDs filling up.

I have 8 OSDs currently down and 100% full (exceeding all the full-ratio settings, both the defaults and the ones I adjusted to try to keep things together), and when I try to restart them, they fail out. Is there any way to bring these back from the dead?

Here's some interesting output from journalctl -xeu on the failed OSD:

ceph-osd[2383080]: bluestore::NCB::__restore_allocator::No Valid allocation info on disk (empty file)
ceph-osd[2383080]: bluestore(/var/lib/ceph/osd/ceph-242) _init_alloc::NCB::restore_allocator() failed! Run Full Recovery from ONodes (might take a while) ...
ceph-osd[2389725]: bluefs _allocate allocation failed, needed 0x3000
ceph-6ab85342-53d6-11ee-88a7-e43d1a153e91-osd-242[2389718]: -2> 2024-11-25T18:31:42.070+0000 7f0adfdef540 -1 bluefs _flush_range_F allocated: 0x0 offset: 0x0 length: 0x230f
ceph-osd[2389725]: bluefs _flush_range_F allocated: 0x0 offset: 0x0 length: 0x230f

Followed quickly by an abort:

/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.2.6/rpm/el8/BUILD/ceph-17.2.6/src/os/bluestore/BlueFS.cc: In funct>
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.2.6/rpm/el8/BUILD/ceph-17.2.6/src/os/bluestore/BlueFS.cc: 3380: ce>

ceph version 17.2.6 (d7ff0d10654d2280e08f1ab989c7cdf3064446a5) quincy (stable)
 1: (ceph::__ceph_abort(char const*, int, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0xd7) [0x559bf4361d2f]
 2: (BlueFS::_flush_range_F(BlueFS::FileWriter*, unsigned long, unsigned long)+0x7a9) [0x559bf4b225f9]
 3: (BlueFS::_flush_F(BlueFS::FileWriter*, bool, bool*)+0xa2) [0x559bf4b22812]
 4: (BlueFS::fsync(BlueFS::FileWriter*)+0x8e) [0x559bf4b40c3e]
 5: (BlueRocksWritableFile::Sync()+0x19) [0x559bf4b51ed9]
 6: (rocksdb::LegacyWritableFileWrapper::Sync(rocksdb::IOOptions const&, rocksdb::IODebugContext*)+0x22) [0x559bf507fbd2]
 7: (rocksdb::WritableFileWriter::SyncInternal(bool)+0x5aa) [0x559bf51a880a]
 8: (rocksdb::WritableFileWriter::Sync(bool)+0x100) [0x559bf51aa0a0]
 9: (rocksdb::SyncManifest(rocksdb::Env*, rocksdb::ImmutableDBOptions const*, rocksdb::WritableFileWriter*)+0x10b) [0x559bf51a3bfb]
 10: (rocksdb::VersionSet::ProcessManifestWrites(std::deque<rocksdb::VersionSet::ManifestWriter, std::allocator<rocksdb::VersionSet::ManifestWriter> >&, rocksdb::InstrumentedMutex*, rocksdb::FSDirectory*, bool, rocks>
 11: (rocksdb::VersionSet::LogAndApply(rocksdb::autovector<rocksdb::ColumnFamilyData*, 8ul> const&, rocksdb::autovector<rocksdb::MutableCFOptions const*, 8ul> const&, rocksdb::autovector<rocksdb::autovector<rocksdb::>
 12: (rocksdb::VersionSet::LogAndApply(rocksdb::ColumnFamilyData*, rocksdb::MutableCFOptions const&, rocksdb::VersionEdit*, rocksdb::InstrumentedMutex*, rocksdb::FSDirectory*, bool, rocksdb::ColumnFamilyOptions const>
 13: (rocksdb::DBImpl::DeleteUnreferencedSstFiles()+0xa30) [0x559bf50bd250]
 14: (rocksdb::DBImpl::Recover(std::vector<rocksdb::ColumnFamilyDescriptor, std::allocator<rocksdb::ColumnFamilyDescriptor> > const&, bool, bool, bool, unsigned long*)+0x13f1) [0x559bf50d3f21]
 15: (rocksdb::DBImpl::Open(rocksdb::DBOptions const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<rocksdb::ColumnFamilyDescriptor, std::allocator<rocksdb::Colu>
 16: (rocksdb::DB::Open(rocksdb::DBOptions const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<rocksdb::ColumnFamilyDescriptor, std::allocator<rocksdb::ColumnFa>
 17: (RocksDBStore::do_open(std::ostream&, bool, bool, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x77a) [0x559bf503766a]
 18: (BlueStore::_open_db(bool, bool, bool)+0xbb4) [0x559bf4a4bff4]
 19: (BlueStore::_open_db_and_around(bool, bool)+0x500) [0x559bf4a766e0]
 20: (BlueStore::_mount()+0x396) [0x559bf4a795d6]
 21: (OSD::init()+0x556) [0x559bf44a0eb6]
 22: main()
 23: __libc_start_main()
 24: _start()

*** Caught signal (Aborted) **
 in thread 7f0adfdef540 thread_name:ceph-osd

ceph version 17.2.6 (d7ff0d10654d2280e08f1ab989c7cdf3064446a5) quincy (stable)
 1: /lib64/libpthread.so.0(+0x12cf0) [0x7f0addff1cf0]
 2: gsignal()
 3: abort()
 4: (ceph::__ceph_abort(char const*, int, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x197) [0x559bf4361def]
 5: (BlueFS::_flush_range_F(BlueFS::FileWriter*, unsigned long, unsigned long)+0x7a9) [0x559bf4b225f9]
 6: (BlueFS::_flush_F(BlueFS::FileWriter*, bool, bool*)+0xa2) [0x559bf4b22812]
 7: (BlueFS::fsync(BlueFS::FileWriter*)+0x8e) [0x559bf4b40c3e]
 8: (BlueRocksWritableFile::Sync()+0x19) [0x559bf4b51ed9]
 9: (rocksdb::LegacyWritableFileWrapper::Sync(rocksdb::IOOptions const&, rocksdb::IODebugContext*)+0x22) [0x559bf507fbd2]
 10: (rocksdb::WritableFileWriter::SyncInternal(bool)+0x5aa) [0x559bf51a880a]
 11: (rocksdb::WritableFileWriter::Sync(bool)+0x100) [0x559bf51aa0a0]
 12: (rocksdb::SyncManifest(rocksdb::Env*, rocksdb::ImmutableDBOptions const*, rocksdb::WritableFileWriter*)+0x10b) [0x559bf51a3bfb]
 13: (rocksdb::VersionSet::ProcessManifestWrites(std::deque<rocksdb::VersionSet::ManifestWriter, std::allocator<rocksdb::VersionSet::ManifestWriter> >&, rocksdb::InstrumentedMutex*, rocksdb::FSDirectory*, bool, rocks>
 14: (rocksdb::VersionSet::LogAndApply(rocksdb::autovector<rocksdb::ColumnFamilyData*, 8ul> const&, rocksdb::autovector<rocksdb::MutableCFOptions const*, 8ul> const&, rocksdb::autovector<rocksdb::autovector<rocksdb::>
 15: (rocksdb::VersionSet::LogAndApply(rocksdb::ColumnFamilyData*, rocksdb::MutableCFOptions const&, rocksdb::VersionEdit*, rocksdb::InstrumentedMutex*, rocksdb::FSDirectory*, bool, rocksdb::ColumnFamilyOptions const>
 16: (rocksdb::DBImpl::DeleteUnreferencedSstFiles()+0xa30) [0x559bf50bd250]
 17: (rocksdb::DBImpl::Recover(std::vector<rocksdb::ColumnFamilyDescriptor, std::allocator<rocksdb::ColumnFamilyDescriptor> > const&, bool, bool, bool, unsigned long*)+0x13f1) [0x559bf50d3f21]
 18: (rocksdb::DBImpl::Open(rocksdb::DBOptions const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<rocksdb::ColumnFamilyDescriptor, std::allocator<rocksdb::Colu>
 19: (rocksdb::DB::Open(rocksdb::DBOptions const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<rocksdb::ColumnFamilyDescriptor, std::allocator<rocksdb::ColumnFa>
 20: (RocksDBStore::do_open(std::ostream&, bool, bool, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x77a) [0x559bf503766a]
 21: (BlueStore::_open_db(bool, bool, bool)+0xbb4) [0x559bf4a4bff4]
 22: (BlueStore::_open_db_and_around(bool, bool)+0x500) [0x559bf4a766e0]
 23: (BlueStore::_mount()+0x396) [0x559bf4a795d6]
 24: (OSD::init()+0x556) [0x559bf44a0eb6]
 25: main()
 26: __libc_start_main()
 27: _start()
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.