Re: Pacific bluestore_volume_selection_policy

Hi Igor,

That’s correct (shown below).
Would it be helpful for me to add logs/uploaded crash UUIDs to 53906 <https://tracker.ceph.com/issues/53906>, 53907 <https://tracker.ceph.com/issues/53907>, 54209 <https://tracker.ceph.com/issues/54209>, 62928 <https://tracker.ceph.com/issues/62928>, 63110 <https://tracker.ceph.com/issues/63110>, 63161 <https://tracker.ceph.com/issues/63161>, 63352 <https://tracker.ceph.com/issues/63352>?
Or should I open a new tracker for the parameter change not being properly persisted (or whatever is actually happening)?
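For what it’s worth, this is the sanity check I’m running on the affected OSDs — just a sketch, and the log path and temp file are assumptions about my layout, not anything authoritative. The sample frame is pasted from the backtrace below, to show the line I’m grepping for per your suggestion:

```shell
# Where the value comes from (file / mon / override), to rule out it
# being dropped on restart -- run against a live cluster:
#   ceph config show osd.12 bluestore_volume_selection_policy
#   ceph tell osd.12 config show | grep bluestore_volume_selection_policy
#
# Then grep the OSD log (normally /var/log/ceph/ceph-osd.12.log -- an
# assumption about the layout) for the volume-selector frame, since
# `ceph crash info` only records the assert condition, not the backtrace.
# Sample frames pasted from the backtrace below:
cat <<'EOF' >/tmp/osd12_frames.txt
 3: (RocksDBBlueFSVolumeSelector::sub_usage(void*, bluefs_fnode_t const&)+0x112) [0x55d519e040f2]
 4: (BlueFS::_flush_range_F(BlueFS::FileWriter*, unsigned long, unsigned long)+0x69d) [0x55d519ea0fad]
EOF
grep -c 'RocksDBBlueFSVolumeSelector' /tmp/osd12_frames.txt
```

If that count is non-zero in the real log, the selector is still in the write path despite the setting.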

Thanks,
Reed

> /build/ceph-16.2.14/src/os/bluestore/BlueStore.h: 3870: FAILED ceph_assert(cur >= p.length)
> 
>  ceph version 16.2.14 (238ba602515df21ea7ffc75c88db29f9e5ef12c9) pacific (stable)
>  1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x152) [0x55d51970a987]
>  2: /usr/bin/ceph-osd(+0xad3b8f) [0x55d51970ab8f]
>  3: (RocksDBBlueFSVolumeSelector::sub_usage(void*, bluefs_fnode_t const&)+0x112) [0x55d519e040f2]
>  4: (BlueFS::_flush_range_F(BlueFS::FileWriter*, unsigned long, unsigned long)+0x69d) [0x55d519ea0fad]
>  5: (BlueFS::_flush_F(BlueFS::FileWriter*, bool, bool*)+0xaa) [0x55d519ea14ea]
>  6: (BlueFS::fsync(BlueFS::FileWriter*)+0x7d) [0x55d519ec61ed]
>  7: (BlueRocksWritableFile::Sync()+0x19) [0x55d519ed5a59]
>  8: (rocksdb::LegacyWritableFileWrapper::Sync(rocksdb::IOOptions const&, rocksdb::IODebugContext*)+0x52) [0x55d51a3e37ce]
>  9: (rocksdb::WritableFileWriter::SyncInternal(bool)+0x216) [0x55d51a5eddac]
>  10: (rocksdb::WritableFileWriter::Sync(bool)+0x17b) [0x55d51a5ed785]
>  11: (rocksdb::DBImpl::WriteToWAL(rocksdb::WriteThread::WriteGroup const&, rocksdb::log::Writer*, unsigned long*, bool, bool, unsigned long)+0x39a) [0x55d51a441bf8]
>  12: (rocksdb::DBImpl::WriteImpl(rocksdb::WriteOptions const&, rocksdb::WriteBatch*, rocksdb::WriteCallback*, unsigned long*, unsigned long, bool, unsigned long*, unsigned long, rocksdb::PreReleaseCallback*)+0x135e) [0x55d51a43d96c]
>  13: (rocksdb::DBImpl::Write(rocksdb::WriteOptions const&, rocksdb::WriteBatch*)+0x5d) [0x55d51a43c56f]
>  14: (RocksDBStore::submit_common(rocksdb::WriteOptions&, std::shared_ptr<KeyValueDB::TransactionImpl>)+0x85) [0x55d51a388635]
>  15: (RocksDBStore::submit_transaction_sync(std::shared_ptr<KeyValueDB::TransactionImpl>)+0x9b) [0x55d51a38904b]
>  16: (BlueStore::_kv_sync_thread()+0x22bc) [0x55d519e016dc]
>  17: (BlueStore::KVSyncThread::entry()+0x11) [0x55d519e2de71]
>  18: /lib/x86_64-linux-gnu/libpthread.so.0(+0x8609) [0x7f490cf23609]
>  19: clone()
> 
>      0> 2024-01-10T11:39:05.922-0500 7f48f978d700 -1 *** Caught signal (Aborted) **
>  in thread 7f48f978d700 thread_name:bstore_kv_sync
> 
>  ceph version 16.2.14 (238ba602515df21ea7ffc75c88db29f9e5ef12c9) pacific (stable)
>  1: /lib/x86_64-linux-gnu/libpthread.so.0(+0x14420) [0x7f490cf2f420]
>  2: gsignal()
>  3: abort()
>  4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1ad) [0x55d51970a9e2]
>  5: /usr/bin/ceph-osd(+0xad3b8f) [0x55d51970ab8f]
>  6: (RocksDBBlueFSVolumeSelector::sub_usage(void*, bluefs_fnode_t const&)+0x112) [0x55d519e040f2]
>  7: (BlueFS::_flush_range_F(BlueFS::FileWriter*, unsigned long, unsigned long)+0x69d) [0x55d519ea0fad]
>  8: (BlueFS::_flush_F(BlueFS::FileWriter*, bool, bool*)+0xaa) [0x55d519ea14ea]
>  9: (BlueFS::fsync(BlueFS::FileWriter*)+0x7d) [0x55d519ec61ed]
>  10: (BlueRocksWritableFile::Sync()+0x19) [0x55d519ed5a59]
>  11: (rocksdb::LegacyWritableFileWrapper::Sync(rocksdb::IOOptions const&, rocksdb::IODebugContext*)+0x52) [0x55d51a3e37ce]
>  12: (rocksdb::WritableFileWriter::SyncInternal(bool)+0x216) [0x55d51a5eddac]
>  13: (rocksdb::WritableFileWriter::Sync(bool)+0x17b) [0x55d51a5ed785]
>  14: (rocksdb::DBImpl::WriteToWAL(rocksdb::WriteThread::WriteGroup const&, rocksdb::log::Writer*, unsigned long*, bool, bool, unsigned long)+0x39a) [0x55d51a441bf8]
>  15: (rocksdb::DBImpl::WriteImpl(rocksdb::WriteOptions const&, rocksdb::WriteBatch*, rocksdb::WriteCallback*, unsigned long*, unsigned long, bool, unsigned long*, unsigned long, rocksdb::PreReleaseCallback*)+0x135e) [0x55d51a43d96c]
>  16: (rocksdb::DBImpl::Write(rocksdb::WriteOptions const&, rocksdb::WriteBatch*)+0x5d) [0x55d51a43c56f]
>  17: (RocksDBStore::submit_common(rocksdb::WriteOptions&, std::shared_ptr<KeyValueDB::TransactionImpl>)+0x85) [0x55d51a388635]
>  18: (RocksDBStore::submit_transaction_sync(std::shared_ptr<KeyValueDB::TransactionImpl>)+0x9b) [0x55d51a38904b]
>  19: (BlueStore::_kv_sync_thread()+0x22bc) [0x55d519e016dc]
>  20: (BlueStore::KVSyncThread::entry()+0x11) [0x55d519e2de71]
>  21: /lib/x86_64-linux-gnu/libpthread.so.0(+0x8609) [0x7f490cf23609]
>  22: clone()
>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

> On Jan 10, 2024, at 12:06 PM, Igor Fedotov <igor.fedotov@xxxxxxxx> wrote:
> 
> Hi Reed,
> 
> it looks to me like your settings aren't taking effect. You might want to check the OSD log rather than the crash info and look at the assertion's backtrace.
> 
> Does it mention RocksDBBlueFSVolumeSelector, like the one in https://tracker.ceph.com/issues/53906 <https://tracker.ceph.com/issues/53906> does?
> 
> ceph version 17.0.0-10229-g7e035110 (7e035110784fba02ba81944e444be9a36932c6a3) quincy (dev)
>  1: /lib64/libpthread.so.0(+0x12c20) [0x7f2beb318c20]
>  2: gsignal()
>  3: abort()
>  4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1b0) [0x56347eb33bec]
>  5: /usr/bin/ceph-osd(+0x5d5daf) [0x56347eb33daf]
>  6: (RocksDBBlueFSVolumeSelector::add_usage(void*, bluefs_fnode_t const&)+0) [0x56347f1f7d00]
>  7: (BlueFS::_flush_range_F(BlueFS::FileWriter*, unsigned long, unsigned long)+0x735) [0x56347f295b45]
> 
> 
> If so, then the parameter change still isn't being applied properly.
> 
> Thanks
> Igor
> 
> On 10/01/2024 20:13, Reed Dier wrote:
>> Well, sadly, that setting doesn’t seem to resolve the issue.
>> 
>> I set the value in ceph.conf for the OSDs with small WAL/DB devices that keep running into the issue:
>> 
>>> $  ceph tell osd.12 config show | grep bluestore_volume_selection_policy
>>>     "bluestore_volume_selection_policy": "rocksdb_original",
>>> $ ceph crash info 2024-01-10T16:39:05.925534Z_f0c57ca3-b7e6-4511-b7ae-5834541d6c67 | egrep "(assert_condition|entity_name)"
>>>     "assert_condition": "cur >= p.length",
>>>     "entity_name": "osd.12",
>> 
>> So, I guess that configuration item doesn’t in fact prevent the crash as was purported.
>> Looks like I may need to fast track moving to quincy…
>> 
>> Reed

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



