Re: After upgrading from 17.2.6 to 18.2.0, OSDs are very frequently restarting due to livenessprobe failures

Hi Sudhin,

It looks like manual DB compactions are being issued (periodically?) via the admin socket for your OSDs. My working hypothesis is that these trigger DB access stalls.

Here are the log lines indicating such calls:

debug 2023-09-22T11:24:55.234+0000 7fc4efa20700  1 osd.1 1192508 triggering manual compaction

debug 2023-09-21T15:35:22.696+0000 7faf22c8b700  1 osd.2 1180406 finished manual compaction in 722.287 seconds
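
To see how often such compactions happen and how long they take across the collected OSD logs, something like the following might help. It is only a rough sketch: the regular expression is modelled on the two log lines quoted above, and the names (COMPACTION_RE, CompactionEvent, find_compactions) are made up for illustration. Other releases or log settings may format these messages differently.

```python
import re
from typing import Iterable, Iterator, NamedTuple, Optional

# Assumption: the layout matches the two lines quoted above, i.e.
# "<timestamp> <thread id> <level> osd.N <epoch> <message>".
COMPACTION_RE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2}T\S+)\s+\S+\s+\d+\s+"
    r"(?P<osd>osd\.\d+)\s+\d+\s+"
    r"(?P<event>triggering|finished) manual compaction"
    r"(?: in (?P<secs>\d+(?:\.\d+)?) seconds)?"
)


class CompactionEvent(NamedTuple):
    timestamp: str
    osd: str
    event: str                 # "triggering" or "finished"
    seconds: Optional[float]   # only set for "finished" events


def find_compactions(lines: Iterable[str]) -> Iterator[CompactionEvent]:
    """Yield one event per log line that mentions a manual compaction."""
    for line in lines:
        m = COMPACTION_RE.search(line)
        if m is None:
            continue
        secs = m.group("secs")
        yield CompactionEvent(
            m.group("ts"),
            m.group("osd"),
            m.group("event"),
            float(secs) if secs is not None else None,
        )
```

Feeding it an open log file (for example `find_compactions(open(path))`) and comparing the "finished" timestamps and durations with the times of the liveness probe failures should show whether the two line up.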

So I'm curious: do you have some external tooling performing manual OSD compactions? If so, does the primary issue go away when it is disabled?

You might want to disable it cluster-wide and let the OSDs run for a while to make sure that's the case. Then try to reproduce the issue by manually running compaction for a specific OSD via the CLI. Does it fail again?


If the above hypothesis is confirmed, I can see two potential root causes:

1. The hybrid allocator might cause severe BlueFS stalls which make the OSD unresponsive. See https://tracker.ceph.com/issues/62815

2. Default RocksDB settings were changed in Reef. See https://github.com/ceph/ceph/pull/51900


The easiest way to verify whether you're facing 1. is to set bluestore_allocator to bitmap for all the OSDs via the "ceph config set" command and then restart them. Then monitor OSD behavior during manual compactions.
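
A minimal sketch of the two steps above (the allocator switch and a manual compaction on one OSD), written as a dry-run helper. The function names are hypothetical. It assumes the ceph CLI is reachable from wherever the script runs (in a containerized deployment that is probably a toolbox/admin pod) and that "ceph tell osd.<id> compact" is how the compaction gets triggered manually. Restarting the OSDs after the allocator change is not covered here.

```python
import shlex
import subprocess
from typing import List


def allocator_cmd(allocator: str = "bitmap") -> List[str]:
    """Command that sets bluestore_allocator for all OSDs (a restart is still needed)."""
    return ["ceph", "config", "set", "osd", "bluestore_allocator", allocator]


def compact_cmd(osd_id: int) -> List[str]:
    """Command that asks a single OSD to run a manual DB compaction."""
    return ["ceph", "tell", f"osd.{osd_id}", "compact"]


def run(cmd: List[str], dry_run: bool = True) -> str:
    """Print the command by default; only execute it when dry_run is False."""
    line = shlex.join(cmd)
    if dry_run:
        print("would run:", line)
    else:
        subprocess.run(cmd, check=True)
    return line
```

Calling `run(allocator_cmd())` and `run(compact_cmd(1))` only prints the commands, so they can be reviewed before anything touches the cluster.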

To validate 2., revert bluestore_rocksdb_options to the original value "compression=kNoCompression,max_write_buffer_number=64,min_write_buffer_number_to_merge=6,compaction_style=kCompactionStyleLevel,write_buffer_size=16777216,max_background_jobs=4,level0_file_num_compaction_trigger=8,max_bytes_for_level_base=1073741824,max_bytes_for_level_multiplier=8,compaction_readahead_size=2MB,max_total_wal_size=1073741824,writable_file_max_buffer_size=0"

I'd recommend doing that for a single OSD first. We don't have much knowledge of how OSDs survive such a reversion, so it is better and safer to do it gradually.
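
Before reverting, it may be useful to see exactly which settings differ between the value an OSD currently uses and the original one quoted above. The sketch below assumes the option string is a flat comma-separated list of key=value pairs, as the quoted value is. It does not handle nested option groups, and the function names are hypothetical.

```python
from typing import Dict, Optional, Tuple

# The pre-Reef value quoted above, split across literals for readability.
ORIGINAL = (
    "compression=kNoCompression,max_write_buffer_number=64,"
    "min_write_buffer_number_to_merge=6,compaction_style=kCompactionStyleLevel,"
    "write_buffer_size=16777216,max_background_jobs=4,"
    "level0_file_num_compaction_trigger=8,max_bytes_for_level_base=1073741824,"
    "max_bytes_for_level_multiplier=8,compaction_readahead_size=2MB,"
    "max_total_wal_size=1073741824,writable_file_max_buffer_size=0"
)


def parse_rocksdb_options(opts: str) -> Dict[str, str]:
    """Split a flat 'k1=v1,k2=v2' option string into a dict."""
    parsed = {}
    for item in opts.split(","):
        item = item.strip()
        if not item:
            continue
        key, _, value = item.partition("=")
        parsed[key.strip()] = value.strip()
    return parsed


def diff_options(current: str, target: str) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Map each differing key to (current value, target value); None means absent."""
    cur, tgt = parse_rocksdb_options(current), parse_rocksdb_options(target)
    return {
        key: (cur.get(key), tgt.get(key))
        for key in sorted(set(cur) | set(tgt))
        if cur.get(key) != tgt.get(key)
    }
```

The current value for one OSD can be read with something like "ceph config get osd.1 bluestore_rocksdb_options" and passed as the first argument, with ORIGINAL as the second.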


Hope this helps. Looking forward to your feedback.

Thanks,

Igor

On 27/09/2023 22:04, sbengeri@xxxxxxxxx wrote:
Hi Igor,

I have copied three OSD logs to
https://drive.google.com/file/d/1aQxibFJR6Dzvr3RbuqnpPhaSMhPSL--F/view?usp=sharing

Hopefully they include some meaningful information.

Thank you.

Sudhin
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx