Re: down OSDs, Bluestore out of space, unable to restart

Hi John,

You haven't described your OSD volume configuration, but if the OSD uses LVM and has only a single main device, you might want to try adding a standalone DB volume.

The 'ceph-volume lvm new-db' command is the preferred way of doing that; see:

https://docs.ceph.com/en/quincy/ceph-volume/lvm/newdb/
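As a rough sketch of what that invocation could look like for osd.242 from the log below: the OSD fsid and the target volume group / logical volume names here are placeholders I made up, and the new LV is assumed to already exist on a device with free space. The script only prints the command so it can be reviewed before anything is run against a stopped OSD.

```shell
#!/bin/sh
# Sketch only: prints the command instead of executing it.
# OSD_ID is taken from the log in this thread; the other two values are
# placeholders (assumptions), not real identifiers from this cluster.
OSD_ID=242
OSD_FSID="<osd-fsid>"            # placeholder: the "osd fsid" shown by `ceph-volume lvm list`
TARGET_LV="vg_fast/osd242_db"    # hypothetical VG/LV created beforehand for the new DB volume

# Build the documented new-db command line from its three arguments.
# usage: new_db_cmd <osd-id> <osd-fsid> <vg/lv>
new_db_cmd() {
    printf 'ceph-volume lvm new-db --osd-id %s --osd-fsid %s --target %s\n' "$1" "$2" "$3"
}

new_db_cmd "$OSD_ID" "$OSD_FSID" "$TARGET_LV"
```

The container name in the log suggests a cephadm deployment, in which case the printed command would presumably be run from inside `cephadm shell --name osd.242`; that is an assumption about this cluster, not something stated in the original message.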


Thanks,

Igor

On 25.11.2024 21:37, John Jasen wrote:
Ceph version 17.2.6

After a power loss event affecting my ceph cluster, I've been putting
Humpty Dumpty back together ever since.

One problem I face is that with objects degraded, rebalancing doesn't run
-- and this resulted in several of my fast OSDs filling up.

I have 8 OSDs currently down and 100% full (exceeding all the full ratio
settings, both the defaults and the ones I raised to try to keep things
together), and when I try to restart them, they fail. Is there any way to
bring these back from the dead?

Here's some interesting output from journalctl -xeu on the failed OSD:

ceph-osd[2383080]: bluestore::NCB::__restore_allocator::No Valid allocation
info on disk (empty file)
ceph-osd[2383080]: bluestore(/var/lib/ceph/osd/ceph-242)
_init_alloc::NCB::restore_allocator() failed! Run Full Recovery from ONodes
(might take a while) ...

ceph-osd[2389725]: bluefs _allocate allocation failed, needed 0x3000

ceph-6ab85342-53d6-11ee-88a7-e43d1a153e91-osd-242[2389718]:     -2>
2024-11-25T18:31:42.070+0000 7f0adfdef540 -1 bluefs _flush_range_F
allocated: 0x0 offset: 0x0 length: 0x230f
ceph-osd[2389725]: bluefs _flush_range_F allocated: 0x0 offset: 0x0 length:
0x230f

Followed quickly by an abort:

/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.2.6/rpm/el8/BUILD/ceph-17.2.6/src/os/bluestore/BlueFS.cc:
In funct>

/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.2.6/rpm/el8/BUILD/ceph-17.2.6/src/os/bluestore/BlueFS.cc:
3380: ce>

                                                              ceph version
17.2.6 (d7ff0d10654d2280e08f1ab989c7cdf3064446a5) quincy (stable)
                                                              1:
(ceph::__ceph_abort(char const*, int, char const*,
std::__cxx11::basic_string<char, std::char_traits<char>,
std::allocator<char> > const&)+0xd7) [0x559bf4361d2f]
                                                              2:
(BlueFS::_flush_range_F(BlueFS::FileWriter*, unsigned long, unsigned
long)+0x7a9) [0x559bf4b225f9]
                                                              3:
(BlueFS::_flush_F(BlueFS::FileWriter*, bool, bool*)+0xa2) [0x559bf4b22812]
                                                              4:
(BlueFS::fsync(BlueFS::FileWriter*)+0x8e) [0x559bf4b40c3e]
                                                              5:
(BlueRocksWritableFile::Sync()+0x19) [0x559bf4b51ed9]
                                                              6:
(rocksdb::LegacyWritableFileWrapper::Sync(rocksdb::IOOptions const&,
rocksdb::IODebugContext*)+0x22) [0x559bf507fbd2]
                                                              7:
(rocksdb::WritableFileWriter::SyncInternal(bool)+0x5aa) [0x559bf51a880a]
                                                              8:
(rocksdb::WritableFileWriter::Sync(bool)+0x100) [0x559bf51aa0a0]
                                                              9:
(rocksdb::SyncManifest(rocksdb::Env*, rocksdb::ImmutableDBOptions const*,
rocksdb::WritableFileWriter*)+0x10b) [0x559bf51a3bfb]
                                                              10:
(rocksdb::VersionSet::ProcessManifestWrites(std::deque<rocksdb::VersionSet::ManifestWriter,
std::allocator<rocksdb::VersionSet::ManifestWriter> >&,
rocksdb::InstrumentedMutex*, rocksdb::FSDirectory*, bool, rocks>
                                                              11:
(rocksdb::VersionSet::LogAndApply(rocksdb::autovector<rocksdb::ColumnFamilyData*,
8ul> const&, rocksdb::autovector<rocksdb::MutableCFOptions const*, 8ul>
const&, rocksdb::autovector<rocksdb::autovector<rocksdb::>
                                                              12:
(rocksdb::VersionSet::LogAndApply(rocksdb::ColumnFamilyData*,
rocksdb::MutableCFOptions const&, rocksdb::VersionEdit*,
rocksdb::InstrumentedMutex*, rocksdb::FSDirectory*, bool,
rocksdb::ColumnFamilyOptions const>
                                                              13:
(rocksdb::DBImpl::DeleteUnreferencedSstFiles()+0xa30) [0x559bf50bd250]
                                                              14:
(rocksdb::DBImpl::Recover(std::vector<rocksdb::ColumnFamilyDescriptor,
std::allocator<rocksdb::ColumnFamilyDescriptor> > const&, bool, bool, bool,
unsigned long*)+0x13f1) [0x559bf50d3f21]
                                                              15:
(rocksdb::DBImpl::Open(rocksdb::DBOptions const&,
std::__cxx11::basic_string<char, std::char_traits<char>,
std::allocator<char> > const&, std::vector<rocksdb::ColumnFamilyDescriptor,
std::allocator<rocksdb::Colu>
                                                              16:
(rocksdb::DB::Open(rocksdb::DBOptions const&,
std::__cxx11::basic_string<char, std::char_traits<char>,
std::allocator<char> > const&, std::vector<rocksdb::ColumnFamilyDescriptor,
std::allocator<rocksdb::ColumnFa>
                                                              17:
(RocksDBStore::do_open(std::ostream&, bool, bool,
std::__cxx11::basic_string<char, std::char_traits<char>,
std::allocator<char> > const&)+0x77a) [0x559bf503766a]
                                                              18:
(BlueStore::_open_db(bool, bool, bool)+0xbb4) [0x559bf4a4bff4]
                                                              19:
(BlueStore::_open_db_and_around(bool, bool)+0x500) [0x559bf4a766e0]
                                                              20:
(BlueStore::_mount()+0x396) [0x559bf4a795d6]
                                                              21:
(OSD::init()+0x556) [0x559bf44a0eb6]
                                                              22: main()
                                                              23:
__libc_start_main()
                                                              24: _start()

*** Caught signal (Aborted) **
                                                              in thread
7f0adfdef540 thread_name:ceph-osd

                                                              ceph version
17.2.6 (d7ff0d10654d2280e08f1ab989c7cdf3064446a5) quincy (stable)
                                                              1:
/lib64/libpthread.so.0(+0x12cf0) [0x7f0addff1cf0]
                                                              2: gsignal()
                                                              3: abort()
                                                              4:
(ceph::__ceph_abort(char const*, int, char const*,
std::__cxx11::basic_string<char, std::char_traits<char>,
std::allocator<char> > const&)+0x197) [0x559bf4361def]
                                                              5:
(BlueFS::_flush_range_F(BlueFS::FileWriter*, unsigned long, unsigned
long)+0x7a9) [0x559bf4b225f9]
                                                              6:
(BlueFS::_flush_F(BlueFS::FileWriter*, bool, bool*)+0xa2) [0x559bf4b22812]
                                                              7:
(BlueFS::fsync(BlueFS::FileWriter*)+0x8e) [0x559bf4b40c3e]
                                                              8:
(BlueRocksWritableFile::Sync()+0x19) [0x559bf4b51ed9]
                                                              9:
(rocksdb::LegacyWritableFileWrapper::Sync(rocksdb::IOOptions const&,
rocksdb::IODebugContext*)+0x22) [0x559bf507fbd2]
                                                              10:
(rocksdb::WritableFileWriter::SyncInternal(bool)+0x5aa) [0x559bf51a880a]
                                                              11:
(rocksdb::WritableFileWriter::Sync(bool)+0x100) [0x559bf51aa0a0]
                                                              12:
(rocksdb::SyncManifest(rocksdb::Env*, rocksdb::ImmutableDBOptions const*,
rocksdb::WritableFileWriter*)+0x10b) [0x559bf51a3bfb]
                                                              13:
(rocksdb::VersionSet::ProcessManifestWrites(std::deque<rocksdb::VersionSet::ManifestWriter,
std::allocator<rocksdb::VersionSet::ManifestWriter> >&,
rocksdb::InstrumentedMutex*, rocksdb::FSDirectory*, bool, rocks>
                                                              14:
(rocksdb::VersionSet::LogAndApply(rocksdb::autovector<rocksdb::ColumnFamilyData*,
8ul> const&, rocksdb::autovector<rocksdb::MutableCFOptions const*, 8ul>
const&, rocksdb::autovector<rocksdb::autovector<rocksdb::>
                                                              15:
(rocksdb::VersionSet::LogAndApply(rocksdb::ColumnFamilyData*,
rocksdb::MutableCFOptions const&, rocksdb::VersionEdit*,
rocksdb::InstrumentedMutex*, rocksdb::FSDirectory*, bool,
rocksdb::ColumnFamilyOptions const>
                                                              16:
(rocksdb::DBImpl::DeleteUnreferencedSstFiles()+0xa30) [0x559bf50bd250]
                                                              17:
(rocksdb::DBImpl::Recover(std::vector<rocksdb::ColumnFamilyDescriptor,
std::allocator<rocksdb::ColumnFamilyDescriptor> > const&, bool, bool, bool,
unsigned long*)+0x13f1) [0x559bf50d3f21]
                                                              18:
(rocksdb::DBImpl::Open(rocksdb::DBOptions const&,
std::__cxx11::basic_string<char, std::char_traits<char>,
std::allocator<char> > const&, std::vector<rocksdb::ColumnFamilyDescriptor,
std::allocator<rocksdb::Colu>
                                                              19:
(rocksdb::DB::Open(rocksdb::DBOptions const&,
std::__cxx11::basic_string<char, std::char_traits<char>,
std::allocator<char> > const&, std::vector<rocksdb::ColumnFamilyDescriptor,
std::allocator<rocksdb::ColumnFa>
                                                              20:
(RocksDBStore::do_open(std::ostream&, bool, bool,
std::__cxx11::basic_string<char, std::char_traits<char>,
std::allocator<char> > const&)+0x77a) [0x559bf503766a]
                                                              21:
(BlueStore::_open_db(bool, bool, bool)+0xbb4) [0x559bf4a4bff4]
                                                              22:
(BlueStore::_open_db_and_around(bool, bool)+0x500) [0x559bf4a766e0]
                                                              23:
(BlueStore::_mount()+0x396) [0x559bf4a795d6]
                                                              24:
(OSD::init()+0x556) [0x559bf44a0eb6]
                                                              25: main()
                                                              26:
__libc_start_main()
                                                              27: _start()
                                                              NOTE: a copy
of the executable, or `objdump -rdS <executable>` is needed to interpret
this.
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx