Hi Zakhar,
Please see my reply to the post on a similar issue at:
https://lists.ceph.io/hyperkitty/list/ceph-users@xxxxxxx/message/YNJ35HXN4HXF4XWB6IOZ2RKXX7EQCEIY/
Thanks,
Igor
On 16/10/2023 09:26, Zakhar Kirpichenko wrote:
Hi,
After upgrading to Ceph 16.2.14 we had several OSD crashes
in the bstore_kv_sync thread:
"assert_thread_name": "bstore_kv_sync",
"backtrace": [
    "/lib64/libpthread.so.0(+0x12cf0) [0x7ff2f6750cf0]",
    "gsignal()",
    "abort()",
    "(ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1a9) [0x564dc5f87d0b]",
    "/usr/bin/ceph-osd(+0x584ed4) [0x564dc5f87ed4]",
    "(RocksDBBlueFSVolumeSelector::sub_usage(void*, bluefs_fnode_t const&)+0x15e) [0x564dc6604a9e]",
    "(BlueFS::_flush_range_F(BlueFS::FileWriter*, unsigned long, unsigned long)+0x77d) [0x564dc66951cd]",
    "(BlueFS::_flush_F(BlueFS::FileWriter*, bool, bool*)+0x90) [0x564dc6695670]",
    "(BlueFS::fsync(BlueFS::FileWriter*)+0x18b) [0x564dc66b1a6b]",
    "(BlueRocksWritableFile::Sync()+0x18) [0x564dc66c1768]",
    "(rocksdb::LegacyWritableFileWrapper::Sync(rocksdb::IOOptions const&, rocksdb::IODebugContext*)+0x1f) [0x564dc6b6496f]",
    "(rocksdb::WritableFileWriter::SyncInternal(bool)+0x402) [0x564dc6c761c2]",
    "(rocksdb::WritableFileWriter::Sync(bool)+0x88) [0x564dc6c77808]",
    "(rocksdb::DBImpl::WriteToWAL(rocksdb::WriteThread::WriteGroup const&, rocksdb::log::Writer*, unsigned long*, bool, bool, unsigned long)+0x309) [0x564dc6b780c9]",
    "(rocksdb::DBImpl::WriteImpl(rocksdb::WriteOptions const&, rocksdb::WriteBatch*, rocksdb::WriteCallback*, unsigned long*, unsigned long, bool, unsigned long*, unsigned long, rocksdb::PreReleaseCallback*)+0x2629) [0x564dc6b80c69]",
    "(rocksdb::DBImpl::Write(rocksdb::WriteOptions const&, rocksdb::WriteBatch*)+0x21) [0x564dc6b80e61]",
    "(RocksDBStore::submit_common(rocksdb::WriteOptions&, std::shared_ptr<KeyValueDB::TransactionImpl>)+0x84) [0x564dc6b1f644]",
    "(RocksDBStore::submit_transaction_sync(std::shared_ptr<KeyValueDB::TransactionImpl>)+0x9a) [0x564dc6b2004a]",
    "(BlueStore::_kv_sync_thread()+0x30d8) [0x564dc6602ec8]",
    "(BlueStore::KVSyncThread::entry()+0x11) [0x564dc662ab61]",
    "/lib64/libpthread.so.0(+0x81ca) [0x7ff2f67461ca]",
    "clone()"
],
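When comparing several crash reports, it can help to reduce each backtrace frame to its bare symbol so that two crashes with different addresses can be matched by call path. The following is only a rough sketch, assuming frames are strings shaped like the ones in the backtrace above; the helper name frame_symbol is hypothetical and not part of any Ceph tooling:

```python
import re

# Matches frames of the form "(symbol+0xOFFSET) [0xADDRESS]".
# Frames without a resolved symbol (e.g. "/lib64/libpthread.so.0(+0x12cf0) [...]")
# or without an address (e.g. "gsignal()") do not match and are returned unchanged.
_FRAME_RE = re.compile(r"^\((?P<symbol>.+)\+0x[0-9a-f]+\) \[0x[0-9a-f]+\]$")


def frame_symbol(frame: str) -> str:
    """Return the symbol part of a backtrace frame, or the frame itself."""
    match = _FRAME_RE.match(frame)
    return match.group("symbol") if match else frame


if __name__ == "__main__":
    frames = [
        "abort()",
        "(BlueFS::fsync(BlueFS::FileWriter*)+0x18b) [0x564dc66b1a6b]",
        "(BlueRocksWritableFile::Sync()+0x18) [0x564dc66c1768]",
    ]
    for frame in frames:
        print(frame_symbol(frame))
```

Two crashes whose lists of symbols are identical most likely hit the same assertion through the same call path, even if the load addresses differ between hosts.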
I am attaching two instances of crash info for further reference:
https://pastebin.com/E6myaHNU
OSD configuration is rather simple and close to default:
osd.6    dev        bluestore_cache_size_hdd    4294967296
osd.6    dev        bluestore_cache_size_ssd    4294967296
osd      advanced   debug_rocksdb               1/5
osd      advanced   osd_max_backfills           2
osd      basic      osd_memory_target           17179869184
osd      advanced   osd_recovery_max_active     2
osd      advanced   osd_scrub_sleep             0.100000
osd      advanced   rbd_balance_parent_reads    false
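For readability, the size values above are in bytes. A minimal sketch of the conversion to GiB follows; the helper name to_gib is purely illustrative and the values are copied from the listing above:

```python
GIB = 1024 ** 3  # bytes per GiB


def to_gib(n_bytes: int) -> float:
    """Convert a byte count to GiB."""
    return n_bytes / GIB


# Byte-valued settings copied from the configuration listing.
settings = {
    "bluestore_cache_size_hdd": 4294967296,
    "bluestore_cache_size_ssd": 4294967296,
    "osd_memory_target": 17179869184,
}

if __name__ == "__main__":
    for name, value in settings.items():
        print(f"{name}: {to_gib(value):g} GiB")
```

That is, both BlueStore cache sizes are 4 GiB and osd_memory_target is 16 GiB.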
debug_rocksdb is a recent change; otherwise this configuration has been
running without issues for months. The crashes happened on two different
hosts with identical hardware, and neither the hosts nor the storage
(NVMe DB/WAL, HDD block) exhibit any issues. We have not experienced such
crashes with Ceph < 16.2.14.
Is this a known issue, or should I open a bug report?
Best regards,
Zakhar
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx