Re: Ceph 16.2.14: OSDs randomly crash in bstore_kv_sync

Thank you, Igor.

It is somewhat disappointing that fixing this bug in Pacific has such a low
priority, considering its impact on existing clusters.

The document attached to the PR explicitly says about
`level_compaction_dynamic_level_bytes` that "enabling it on an existing DB
requires special caution", so we'd rather not experiment with something that
has the potential to cause data corruption or loss in a production cluster.
Perhaps a downgrade to the previous version, 16.2.13, which worked for us
without any issues, is an option, or would you advise against such a
downgrade from 16.2.14?
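
For reference, if we do try the suggested workaround, my understanding is
that it would amount to roughly the following (untested on our side; the
"<existing options>" placeholder stands for the current
bluestore_rocksdb_options string, and the full procedure for an existing DB
is the one described in the document attached to the PR, not just these two
commands):

    # switch the BlueFS volume selector; takes effect when the OSD restarts
    ceph config set osd bluestore_volume_selection_policy fit_to_fast

    # append the RocksDB option to the existing bluestore_rocksdb_options
    # string, then restart each OSD and follow the remaining steps from the
    # document attached to the PR
    ceph config set osd bluestore_rocksdb_options \
      "<existing options>,level_compaction_dynamic_level_bytes=true"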

/Z

On Fri, 20 Oct 2023 at 14:46, Igor Fedotov <igor.fedotov@xxxxxxxx> wrote:

> Hi Zakhar,
>
> We definitely expect one more (and apparently the last) Pacific minor
> release. There is no specific date yet though; the plan is to release the
> Quincy and Reef minor releases prior to it, hopefully before Christmas/New
> Year.
>
> Meanwhile you might want to work around the issue by tuning
> bluestore_volume_selection_policy. Unfortunately, my original proposal to
> set it to rocksdb_original most likely wouldn't work in this case, so you
> had better try the "fit_to_fast" mode. This should be coupled with enabling
> the 'level_compaction_dynamic_level_bytes' mode in RocksDB; there is a
> pretty good spec on applying this mode to BlueStore attached to
> https://github.com/ceph/ceph/pull/37156.
>
>
> Thanks,
>
> Igor
> On 20/10/2023 06:03, Zakhar Kirpichenko wrote:
>
> Igor, I noticed that there's no roadmap for the next 16.2.x release. May I
> ask what time frame we are looking at with regard to a possible fix?
>
> We're experiencing several OSD crashes caused by this issue per day.
>
> /Z
>
> On Mon, 16 Oct 2023 at 14:19, Igor Fedotov <igor.fedotov@xxxxxxxx> wrote:
>
>> That's true.
>> On 16/10/2023 14:13, Zakhar Kirpichenko wrote:
>>
>> Many thanks, Igor. I found previously submitted bug reports and
>> subscribed to them. My understanding is that the issue is going to be fixed
>> in the next Pacific minor release.
>>
>> /Z
>>
>> On Mon, 16 Oct 2023 at 14:03, Igor Fedotov <igor.fedotov@xxxxxxxx> wrote:
>>
>>> Hi Zakhar,
>>>
>>> please see my reply to the post on a similar issue at:
>>>
>>> https://lists.ceph.io/hyperkitty/list/ceph-users@xxxxxxx/message/YNJ35HXN4HXF4XWB6IOZ2RKXX7EQCEIY/
>>>
>>>
>>> Thanks,
>>>
>>> Igor
>>>
>>> On 16/10/2023 09:26, Zakhar Kirpichenko wrote:
>>> > Hi,
>>> >
>>> > After upgrading to Ceph 16.2.14 we had several OSD crashes in the
>>> > bstore_kv_sync thread:
>>> >
>>> >
>>> >     "assert_thread_name": "bstore_kv_sync",
>>> >     "backtrace": [
>>> >     "/lib64/libpthread.so.0(+0x12cf0) [0x7ff2f6750cf0]",
>>> >     "gsignal()",
>>> >     "abort()",
>>> >     "(ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1a9) [0x564dc5f87d0b]",
>>> >     "/usr/bin/ceph-osd(+0x584ed4) [0x564dc5f87ed4]",
>>> >     "(RocksDBBlueFSVolumeSelector::sub_usage(void*, bluefs_fnode_t const&)+0x15e) [0x564dc6604a9e]",
>>> >     "(BlueFS::_flush_range_F(BlueFS::FileWriter*, unsigned long, unsigned long)+0x77d) [0x564dc66951cd]",
>>> >     "(BlueFS::_flush_F(BlueFS::FileWriter*, bool, bool*)+0x90) [0x564dc6695670]",
>>> >     "(BlueFS::fsync(BlueFS::FileWriter*)+0x18b) [0x564dc66b1a6b]",
>>> >     "(BlueRocksWritableFile::Sync()+0x18) [0x564dc66c1768]",
>>> >     "(rocksdb::LegacyWritableFileWrapper::Sync(rocksdb::IOOptions const&, rocksdb::IODebugContext*)+0x1f) [0x564dc6b6496f]",
>>> >     "(rocksdb::WritableFileWriter::SyncInternal(bool)+0x402) [0x564dc6c761c2]",
>>> >     "(rocksdb::WritableFileWriter::Sync(bool)+0x88) [0x564dc6c77808]",
>>> >     "(rocksdb::DBImpl::WriteToWAL(rocksdb::WriteThread::WriteGroup const&, rocksdb::log::Writer*, unsigned long*, bool, bool, unsigned long)+0x309) [0x564dc6b780c9]",
>>> >     "(rocksdb::DBImpl::WriteImpl(rocksdb::WriteOptions const&, rocksdb::WriteBatch*, rocksdb::WriteCallback*, unsigned long*, unsigned long, bool, unsigned long*, unsigned long, rocksdb::PreReleaseCallback*)+0x2629) [0x564dc6b80c69]",
>>> >     "(rocksdb::DBImpl::Write(rocksdb::WriteOptions const&, rocksdb::WriteBatch*)+0x21) [0x564dc6b80e61]",
>>> >     "(RocksDBStore::submit_common(rocksdb::WriteOptions&, std::shared_ptr<KeyValueDB::TransactionImpl>)+0x84) [0x564dc6b1f644]",
>>> >     "(RocksDBStore::submit_transaction_sync(std::shared_ptr<KeyValueDB::TransactionImpl>)+0x9a) [0x564dc6b2004a]",
>>> >     "(BlueStore::_kv_sync_thread()+0x30d8) [0x564dc6602ec8]",
>>> >     "(BlueStore::KVSyncThread::entry()+0x11) [0x564dc662ab61]",
>>> >     "/lib64/libpthread.so.0(+0x81ca) [0x7ff2f67461ca]",
>>> >     "clone()"
>>> >     ],
>>> >
>>> >
>>> > I am attaching two instances of crash info for further reference:
>>> > https://pastebin.com/E6myaHNU
>>> >
>>> > OSD configuration is rather simple and close to default:
>>> >
>>> > osd.6   dev       bluestore_cache_size_hdd    4294967296
>>> > osd.6   dev       bluestore_cache_size_ssd    4294967296
>>> > osd     advanced  debug_rocksdb               1/5
>>> > osd     advanced  osd_max_backfills           2
>>> > osd     basic     osd_memory_target           17179869184
>>> > osd     advanced  osd_recovery_max_active     2
>>> > osd     advanced  osd_scrub_sleep             0.100000
>>> > osd     advanced  rbd_balance_parent_reads    false
>>> >
>>> > debug_rocksdb is a recent change; otherwise this configuration has been
>>> > running without issues for months. The crashes happened on two different
>>> > hosts with identical hardware, and the hosts and storage (NVMe DB/WAL,
>>> > HDD block) don't exhibit any issues. We have not experienced such
>>> > crashes with Ceph < 16.2.14.
>>> >
>>> > Is this a known issue, or should I open a bug report?
>>> >
>>> > Best regards,
>>> > Zakhar
>>> > _______________________________________________
>>> > ceph-users mailing list -- ceph-users@xxxxxxx
>>> > To unsubscribe send an email to ceph-users-leave@xxxxxxx
>>>
>>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


