Re: Ceph 16.2.14: OSDs randomly crash in bstore_kv_sync

Thanks, Tyler. I appreciate what you're saying, though I can't fully agree:
16.2.13 didn't have crashing OSDs, so the crashes in 16.2.14 seem like a
regression - please correct me if I'm wrong. If it is indeed a regression,
then I'm not sure that suggesting an upgrade is the right answer in this
case.

We would consider upgrading, but unfortunately our OpenStack Wallaby
deployment is holding us back: its Cinder doesn't support Ceph 17.x, so
we're stuck having to find a solution for Ceph 16.x.

/Z

On Fri, 20 Oct 2023 at 15:39, Tyler Stachecki <stachecki.tyler@xxxxxxxxx>
wrote:

> On Fri, Oct 20, 2023, 8:11 AM Zakhar Kirpichenko <zakhar@xxxxxxxxx> wrote:
>
>> Thank you, Igor.
>>
>> It is somewhat disappointing that fixing this bug in Pacific has such a
>> low priority, considering its impact on existing clusters.
>>
>
> Unfortunately, the hard truth here is that Pacific (stable) was released
> over 30 months ago. It has had a good run for a freely distributed product,
> and there's only so much time you can dedicate to backporting bugfixes --
> it claws time away from other forward-thinking initiatives.
>
> Speaking as someone who's been at the helm of production clusters, I know
> Ceph upgrades can be an experience and it's frustrating to hear, but you
> have to jump sometime...
>
> Regards,
> Tyler
>
>
>> On Fri, 20 Oct 2023 at 14:46, Igor Fedotov <igor.fedotov@xxxxxxxx> wrote:
>>
>> > Hi Zakhar,
>> >
>> > We definitely expect one more (and apparently the last) Pacific minor
>> > release. There is no specific date yet, though - the plan is to release
>> > the Quincy and Reef minor releases prior to it, hopefully before the
>> > Christmas/New Year holidays.
>> >
>> > Meanwhile you might want to work around the issue by tuning
>> > bluestore_volume_selection_policy. Unfortunately, my original proposal
>> > to set it to rocksdb_original most likely wouldn't work in this case,
>> > so you'd better try the "fit_to_fast" mode. This should be coupled with
>> > enabling the 'level_compaction_dynamic_level_bytes' mode in RocksDB -
>> > there is a pretty good spec on applying this mode to BlueStore attached
>> > to https://github.com/ceph/ceph/pull/37156.
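>> >
>> > As a rough illustration only (not a verified recipe: the placeholder
>> > "<existing options>" below stands for whatever bluestore_rocksdb_options
>> > string your OSDs currently use, and the spec attached to the PR above
>> > describes the safe way to switch an existing DB), applying the two
>> > settings cluster-wide might look something like:
>> >
>> >     # switch the BlueFS volume selector policy for all OSDs
>> >     ceph config set osd bluestore_volume_selection_policy fit_to_fast
>> >
>> >     # enable dynamic level sizing in BlueStore's RocksDB options
>> >     ceph config set osd bluestore_rocksdb_options \
>> >       "<existing options>,level_compaction_dynamic_level_bytes=true"
>> >
>> >     # restart each OSD for the changes to take effect, e.g. on a
>> >     # non-cephadm deployment:
>> >     systemctl restart ceph-osd@6
>> >
>> > Afterwards the effective value can be checked over the admin socket,
>> > e.g. 'ceph daemon osd.6 config get bluestore_volume_selection_policy'.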
>> >
>> >
>> > Thanks,
>> >
>> > Igor
>> > On 20/10/2023 06:03, Zakhar Kirpichenko wrote:
>> >
>> > Igor, I noticed that there's no roadmap for the next 16.2.x release. May
>> > I ask what time frame we are looking at with regard to a possible fix?
>> >
>> > We're experiencing several OSD crashes per day caused by this issue.
>> >
>> > /Z
>> >
>> > On Mon, 16 Oct 2023 at 14:19, Igor Fedotov <igor.fedotov@xxxxxxxx>
>> wrote:
>> >
>> >> That's true.
>> >> On 16/10/2023 14:13, Zakhar Kirpichenko wrote:
>> >>
>> >> Many thanks, Igor. I found the previously submitted bug reports and
>> >> subscribed to them. My understanding is that the issue is going to be
>> >> fixed in the next Pacific minor release.
>> >>
>> >> /Z
>> >>
>> >> On Mon, 16 Oct 2023 at 14:03, Igor Fedotov <igor.fedotov@xxxxxxxx>
>> wrote:
>> >>
>> >>> Hi Zakhar,
>> >>>
>> >>> please see my reply to the post on a similar issue at:
>> >>>
>> >>> https://lists.ceph.io/hyperkitty/list/ceph-users@xxxxxxx/message/YNJ35HXN4HXF4XWB6IOZ2RKXX7EQCEIY/
>> >>>
>> >>>
>> >>> Thanks,
>> >>>
>> >>> Igor
>> >>>
>> >>> On 16/10/2023 09:26, Zakhar Kirpichenko wrote:
>> >>> > Hi,
>> >>> >
>> >>> > After upgrading to Ceph 16.2.14 we have had several OSD crashes in
>> >>> > the bstore_kv_sync thread:
>> >>> >
>> >>> >
>> >>> >     1. "assert_thread_name": "bstore_kv_sync",
>> >>> >     2. "backtrace": [
>> >>> >     3. "/lib64/libpthread.so.0(+0x12cf0) [0x7ff2f6750cf0]",
>> >>> >     4. "gsignal()",
>> >>> >     5. "abort()",
>> >>> >     6. "(ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1a9) [0x564dc5f87d0b]",
>> >>> >     7. "/usr/bin/ceph-osd(+0x584ed4) [0x564dc5f87ed4]",
>> >>> >     8. "(RocksDBBlueFSVolumeSelector::sub_usage(void*, bluefs_fnode_t const&)+0x15e) [0x564dc6604a9e]",
>> >>> >     9. "(BlueFS::_flush_range_F(BlueFS::FileWriter*, unsigned long, unsigned long)+0x77d) [0x564dc66951cd]",
>> >>> >     10. "(BlueFS::_flush_F(BlueFS::FileWriter*, bool, bool*)+0x90) [0x564dc6695670]",
>> >>> >     11. "(BlueFS::fsync(BlueFS::FileWriter*)+0x18b) [0x564dc66b1a6b]",
>> >>> >     12. "(BlueRocksWritableFile::Sync()+0x18) [0x564dc66c1768]",
>> >>> >     13. "(rocksdb::LegacyWritableFileWrapper::Sync(rocksdb::IOOptions const&, rocksdb::IODebugContext*)+0x1f) [0x564dc6b6496f]",
>> >>> >     14. "(rocksdb::WritableFileWriter::SyncInternal(bool)+0x402) [0x564dc6c761c2]",
>> >>> >     15. "(rocksdb::WritableFileWriter::Sync(bool)+0x88) [0x564dc6c77808]",
>> >>> >     16. "(rocksdb::DBImpl::WriteToWAL(rocksdb::WriteThread::WriteGroup const&, rocksdb::log::Writer*, unsigned long*, bool, bool, unsigned long)+0x309) [0x564dc6b780c9]",
>> >>> >     17. "(rocksdb::DBImpl::WriteImpl(rocksdb::WriteOptions const&, rocksdb::WriteBatch*, rocksdb::WriteCallback*, unsigned long*, unsigned long, bool, unsigned long*, unsigned long, rocksdb::PreReleaseCallback*)+0x2629) [0x564dc6b80c69]",
>> >>> >     18. "(rocksdb::DBImpl::Write(rocksdb::WriteOptions const&, rocksdb::WriteBatch*)+0x21) [0x564dc6b80e61]",
>> >>> >     19. "(RocksDBStore::submit_common(rocksdb::WriteOptions&, std::shared_ptr<KeyValueDB::TransactionImpl>)+0x84) [0x564dc6b1f644]",
>> >>> >     20. "(RocksDBStore::submit_transaction_sync(std::shared_ptr<KeyValueDB::TransactionImpl>)+0x9a) [0x564dc6b2004a]",
>> >>> >     21. "(BlueStore::_kv_sync_thread()+0x30d8) [0x564dc6602ec8]",
>> >>> >     22. "(BlueStore::KVSyncThread::entry()+0x11) [0x564dc662ab61]",
>> >>> >     23. "/lib64/libpthread.so.0(+0x81ca) [0x7ff2f67461ca]",
>> >>> >     24. "clone()"
>> >>> >     25. ],
>> >>> >
>> >>> >
>> >>> > I am attaching two instances of crash info for further reference:
>> >>> > https://pastebin.com/E6myaHNU
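>> >>> >
>> >>> > (For reference, such crash reports can also be pulled straight from
>> >>> > the cluster when the crash module is enabled, e.g.:
>> >>> >
>> >>> >     ceph crash ls
>> >>> >     ceph crash info <crash-id>
>> >>> >
>> >>> > where <crash-id> is one of the IDs listed by the first command.)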
>> >>> >
>> >>> > The OSD configuration is rather simple and close to the defaults:
>> >>> >
>> >>> >     osd.6  dev       bluestore_cache_size_hdd   4294967296
>> >>> >     osd.6  dev       bluestore_cache_size_ssd   4294967296
>> >>> >     osd    advanced  debug_rocksdb              1/5
>> >>> >     osd    advanced  osd_max_backfills          2
>> >>> >     osd    basic     osd_memory_target          17179869184
>> >>> >     osd    advanced  osd_recovery_max_active    2
>> >>> >     osd    advanced  osd_scrub_sleep            0.100000
>> >>> >     osd    advanced  rbd_balance_parent_reads   false
>> >>> >
>> >>> > debug_rocksdb is a recent change; otherwise, this configuration has
>> >>> > been running without issues for months. The crashes happened on two
>> >>> > different hosts with identical hardware; the hosts and storage (NVMe
>> >>> > DB/WAL, HDD block) don't exhibit any issues. We have not experienced
>> >>> > such crashes with Ceph < 16.2.14.
>> >>> >
>> >>> > Is this a known issue, or should I open a bug report?
>> >>> >
>> >>> > Best regards,
>> >>> > Zakhar
>> >>>
>> >>
>>
>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


