Re: Ceph 16.2.14: OSDs randomly crash in bstore_kv_sync

On Fri, Oct 20, 2023, 8:51 AM Zakhar Kirpichenko <zakhar@xxxxxxxxx> wrote:

>
> We would consider upgrading, but unfortunately our Openstack Wallaby is
> holding us back as its cinder doesn't support Ceph 17.x, so we're stuck
> with having to find a solution for Ceph 16.x.
>

Wallaby is also quite old at this time... are you aware that the W release
of Cinder has forgone backporting the fix for the critical, high-profile
CVE-2023-2088 due to its age?
https://github.com/openstack/cinder/commit/2fef6c41fa8c5ea772cde227a119dcf22ce7a07d

There was some tension over this at the OpenInfra Summit this past year
between some of the operators and developers. Wallaby is still marked EM
upstream, but even so it did not get this patch.

The story is, unfortunately, the same here: the only way out of some of
these holes is to upgrade...

Regards,
Tyler


>
> On Fri, 20 Oct 2023 at 15:39, Tyler Stachecki <stachecki.tyler@xxxxxxxxx>
> wrote:
>
>> On Fri, Oct 20, 2023, 8:11 AM Zakhar Kirpichenko <zakhar@xxxxxxxxx>
>> wrote:
>>
>>> Thank you, Igor.
>>>
>>> It is somewhat disappointing that fixing this bug in Pacific has such a
>>> low
>>> priority, considering its impact on existing clusters.
>>>
>>
>> Unfortunately, the hard truth here is that Pacific (stable) was released
>> over 30 months ago. It has had a good run for a freely distributed product,
>> and there's only so much time you can dedicate to backporting bugfixes --
>> it claws time away from other forward-thinking initiatives.
>>
>> Speaking as someone who's been at the helm of production clusters, I know
>> Ceph upgrades can be quite an experience, and that this is frustrating to
>> hear, but you have to jump sometime...
>>
>> Regards,
>> Tyler
>>
>>
>>> On Fri, 20 Oct 2023 at 14:46, Igor Fedotov <igor.fedotov@xxxxxxxx>
>>> wrote:
>>>
>>> > Hi Zakhar,
>>> >
>>> > We definitely expect one more (and apparently the last) Pacific minor
>>> > release. There is no specific date yet, though - the plan is to release
>>> > the Quincy and Reef minor releases before it, hopefully before
>>> > Christmas/New Year.
>>> >
>>> > Meanwhile you might want to work around the issue by tuning
>>> > bluestore_volume_selection_policy. Unfortunately, my original proposal
>>> > to set it to rocksdb_original most likely wouldn't work in this case,
>>> > so you'd better try the "fit_to_fast" mode. This should be coupled with
>>> > enabling the 'level_compaction_dynamic_level_bytes' mode in RocksDB -
>>> > there is a pretty good spec on applying this mode to BlueStore attached
>>> > to https://github.com/ceph/ceph/pull/37156.
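
A minimal sketch of how those two settings could be applied cluster-wide via
"ceph config set". The volume selector policy is a regular BlueStore option;
bluestore_rocksdb_options_annex and the exact RocksDB key are assumptions
here, so verify them against the spec attached to the PR above before
applying anything:

    # switch the BlueFS volume selector for all OSDs
    ceph config set osd bluestore_volume_selection_policy fit_to_fast

    # append level_compaction_dynamic_level_bytes without overriding the
    # full default RocksDB option string (assumes the *_annex option is
    # available in this release)
    ceph config set osd bluestore_rocksdb_options_annex \
        level_compaction_dynamic_level_bytes=true

    # restart the OSDs afterwards; these settings are generally only picked
    # up at OSD startup
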
>>> >
>>> >
>>> > Thanks,
>>> >
>>> > Igor
>>> > On 20/10/2023 06:03, Zakhar Kirpichenko wrote:
>>> >
>>> > Igor, I noticed that there's no roadmap for the next 16.2.x release.
>>> > May I ask what time frame we are looking at with regards to a possible
>>> > fix?
>>> >
>>> > We're experiencing several OSD crashes per day caused by this issue.
>>> >
>>> > /Z
>>> >
>>> > On Mon, 16 Oct 2023 at 14:19, Igor Fedotov <igor.fedotov@xxxxxxxx> wrote:
>>> >
>>> >> That's true.
>>> >> On 16/10/2023 14:13, Zakhar Kirpichenko wrote:
>>> >>
>>> >> Many thanks, Igor. I found previously submitted bug reports and
>>> >> subscribed to them. My understanding is that the issue is going to be
>>> >> fixed in the next Pacific minor release.
>>> >>
>>> >> /Z
>>> >>
>>> >> On Mon, 16 Oct 2023 at 14:03, Igor Fedotov <igor.fedotov@xxxxxxxx> wrote:
>>> >>
>>> >>> Hi Zakhar,
>>> >>>
>>> >>> please see my reply to the post on a similar issue at:
>>> >>>
>>> >>> https://lists.ceph.io/hyperkitty/list/ceph-users@xxxxxxx/message/YNJ35HXN4HXF4XWB6IOZ2RKXX7EQCEIY/
>>> >>>
>>> >>>
>>> >>> Thanks,
>>> >>>
>>> >>> Igor
>>> >>>
>>> >>> On 16/10/2023 09:26, Zakhar Kirpichenko wrote:
>>> >>> > Hi,
>>> >>> >
>>> >>> > After upgrading to Ceph 16.2.14 we had several OSD crashes in the
>>> >>> > bstore_kv_sync thread:
>>> >>> >
>>> >>> >
>>> >>> >     "assert_thread_name": "bstore_kv_sync",
>>> >>> >     "backtrace": [
>>> >>> >         "/lib64/libpthread.so.0(+0x12cf0) [0x7ff2f6750cf0]",
>>> >>> >         "gsignal()",
>>> >>> >         "abort()",
>>> >>> >         "(ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1a9) [0x564dc5f87d0b]",
>>> >>> >         "/usr/bin/ceph-osd(+0x584ed4) [0x564dc5f87ed4]",
>>> >>> >         "(RocksDBBlueFSVolumeSelector::sub_usage(void*, bluefs_fnode_t const&)+0x15e) [0x564dc6604a9e]",
>>> >>> >         "(BlueFS::_flush_range_F(BlueFS::FileWriter*, unsigned long, unsigned long)+0x77d) [0x564dc66951cd]",
>>> >>> >         "(BlueFS::_flush_F(BlueFS::FileWriter*, bool, bool*)+0x90) [0x564dc6695670]",
>>> >>> >         "(BlueFS::fsync(BlueFS::FileWriter*)+0x18b) [0x564dc66b1a6b]",
>>> >>> >         "(BlueRocksWritableFile::Sync()+0x18) [0x564dc66c1768]",
>>> >>> >         "(rocksdb::LegacyWritableFileWrapper::Sync(rocksdb::IOOptions const&, rocksdb::IODebugContext*)+0x1f) [0x564dc6b6496f]",
>>> >>> >         "(rocksdb::WritableFileWriter::SyncInternal(bool)+0x402) [0x564dc6c761c2]",
>>> >>> >         "(rocksdb::WritableFileWriter::Sync(bool)+0x88) [0x564dc6c77808]",
>>> >>> >         "(rocksdb::DBImpl::WriteToWAL(rocksdb::WriteThread::WriteGroup const&, rocksdb::log::Writer*, unsigned long*, bool, bool, unsigned long)+0x309) [0x564dc6b780c9]",
>>> >>> >         "(rocksdb::DBImpl::WriteImpl(rocksdb::WriteOptions const&, rocksdb::WriteBatch*, rocksdb::WriteCallback*, unsigned long*, unsigned long, bool, unsigned long*, unsigned long, rocksdb::PreReleaseCallback*)+0x2629) [0x564dc6b80c69]",
>>> >>> >         "(rocksdb::DBImpl::Write(rocksdb::WriteOptions const&, rocksdb::WriteBatch*)+0x21) [0x564dc6b80e61]",
>>> >>> >         "(RocksDBStore::submit_common(rocksdb::WriteOptions&, std::shared_ptr<KeyValueDB::TransactionImpl>)+0x84) [0x564dc6b1f644]",
>>> >>> >         "(RocksDBStore::submit_transaction_sync(std::shared_ptr<KeyValueDB::TransactionImpl>)+0x9a) [0x564dc6b2004a]",
>>> >>> >         "(BlueStore::_kv_sync_thread()+0x30d8) [0x564dc6602ec8]",
>>> >>> >         "(BlueStore::KVSyncThread::entry()+0x11) [0x564dc662ab61]",
>>> >>> >         "/lib64/libpthread.so.0(+0x81ca) [0x7ff2f67461ca]",
>>> >>> >         "clone()"
>>> >>> >     ],
>>> >>> >
>>> >>> >
>>> >>> > I am attaching two instances of crash info for further reference:
>>> >>> > https://pastebin.com/E6myaHNU
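
As a side note, a hedged sketch of how crash reports like the one above can
be pulled directly from the cluster, assuming the mgr crash module is
enabled (this is the standard crash facility, not anything specific to this
bug):

    # list crashes recorded by the mgr crash module
    ceph crash ls

    # print the full metadata (backtrace included) for a specific crash
    ceph crash info <crash-id>
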
>>> >>> >
>>> >>> > OSD configuration is rather simple and close to default:
>>> >>> >
>>> >>> > osd.6    dev       bluestore_cache_size_hdd      4294967296
>>> >>> > osd.6    dev       bluestore_cache_size_ssd      4294967296
>>> >>> > osd      advanced  debug_rocksdb                 1/5
>>> >>> > osd      advanced  osd_max_backfills             2
>>> >>> > osd      basic     osd_memory_target             17179869184
>>> >>> > osd      advanced  osd_recovery_max_active       2
>>> >>> > osd      advanced  osd_scrub_sleep               0.100000
>>> >>> > osd      advanced  rbd_balance_parent_reads      false
>>> >>> >
>>> >>> > debug_rocksdb is a recent change; otherwise this configuration has
>>> >>> > been running without issues for months. The crashes happened on two
>>> >>> > different hosts with identical hardware, and the hosts and storage
>>> >>> > (NVMe DB/WAL, HDD block) don't exhibit any issues. We have not
>>> >>> > experienced such crashes with Ceph < 16.2.14.
>>> >>> >
>>> >>> > Is this a known issue, or should I open a bug report?
>>> >>> >
>>> >>> > Best regards,
>>> >>> > Zakhar
>>> >>> > _______________________________________________
>>> >>> > ceph-users mailing list -- ceph-users@xxxxxxx
>>> >>> > To unsubscribe send an email to ceph-users-leave@xxxxxxx
>>> >>>
>>> >>
>>> _______________________________________________
>>> ceph-users mailing list -- ceph-users@xxxxxxx
>>> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>>>
>>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


