Hello Igor,

On Tue, May 02, 2023 at 05:41:04PM +0300, Igor Fedotov wrote:
> Hi Nikola,
>
> I'd suggest to start monitoring perf counters for your osds,
> op_w_lat/subop_w_lat ones specifically. I presume they raise eventually,
> don't they?

OK, I've started collecting those for all OSDs. Current avgtime values are
around 0.0003 for subop_w_lat and 0.001-0.002 for op_w_lat. I guess it'll
need some time to show a trend, so I'll check tomorrow.

> Does subop_w_lat grow for every OSD or just a subset of them? How large is
> the delta between the best and the worst OSDs after a one week period? How
> many "bad" OSDs are at this point?

I'll see and report.

> And some more questions:
>
> How large are space utilization/fragmentation for your OSDs?

OSD usage is around 16-18%. Fragmentation should not be very bad; this
cluster has only been deployed for a few months.

> Is the same performance drop observed for artificial benchmarks, e.g. 4k
> random writes to a fresh RBD image using fio?

I will check again when the slowdown occurs and report.

> Is there any RAM utilization growth for OSD processes over time? Or may be
> any suspicious growth in mempool stats?

No, RAM usage seems to be pretty constant.
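In case it's useful to anyone following along: these counters come out of
`ceph daemon osd.N perf dump` as avgcount/sum pairs (real dumps also carry a
precomputed avgtime field). A minimal sketch of how I'm computing the
best/worst spread across OSDs -- the JSON fragments and numbers below are
made up for illustration, not from our cluster:

```python
import json

# Illustrative `ceph daemon osd.N perf dump` fragments (made-up numbers).
# Each latency counter is an avgcount/sum pair; avgtime = sum / avgcount.
raw = {
    0: '{"osd": {"op_w_latency": {"avgcount": 1000, "sum": 1.2}}}',
    1: '{"osd": {"op_w_latency": {"avgcount": 1000, "sum": 1.5}}}',
    2: '{"osd": {"op_w_latency": {"avgcount": 1000, "sum": 9.0}}}',
}

def avgtime(dump, counter="op_w_latency"):
    # Guard against a fresh counter with no samples yet.
    c = dump["osd"][counter]
    return c["sum"] / c["avgcount"] if c["avgcount"] else 0.0

lat = {n: avgtime(json.loads(d)) for n, d in raw.items()}
best, worst = min(lat, key=lat.get), max(lat, key=lat.get)
print(f"best osd.{best}: {lat[best]:.4f}s, worst osd.{worst}: {lat[worst]:.4f}s")
# prints: best osd.0: 0.0012s, worst osd.2: 0.0090s
```

Sampling this periodically and diffing the sums between samples would give
the per-interval latency rather than the lifetime average.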
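And, as a note to self, roughly what I plan to run -- the fio re-check and,
one at a time, the brute-force steps discussed in this thread. The osd id N,
data path, and pool/image names below are placeholders:

```shell
# fio baseline: 4k random writes to a fresh RBD image
# (pool/image/client names are placeholders)
fio --name=rbd-4k-randwrite --ioengine=rbd --clientname=admin \
    --pool=testpool --rbdname=testimg \
    --rw=randwrite --bs=4k --iodepth=32 --runtime=60 --time_based

# Offline RocksDB compaction -- the OSD must be stopped first:
systemctl stop ceph-osd@N
ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-N compact
systemctl start ceph-osd@N

# Switch the BlueStore allocator from the default hybrid to bitmap
# (takes effect on OSD restart) -- only after judging the compaction result,
# one modification at a time:
ceph config set osd bluestore_allocator bitmap
```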
However, it's probably worth noting that historically we're using the
following OSD options:

ceph config set osd bluestore_rocksdb_options compression=kNoCompression,max_write_buffer_number=32,min_write_buffer_number_to_merge=2,recycle_log_file_num=32,compaction_style=kCompactionStyleLevel,write_buffer_size=67108864,target_file_size_base=67108864,max_background_compactions=31,level0_file_num_compaction_trigger=8,level0_slowdown_writes_trigger=32,level0_stop_writes_trigger=64,max_bytes_for_level_base=536870912,compaction_threads=32,max_bytes_for_level_multiplier=8,flusher_threads=8,compaction_readahead_size=2MB
ceph config set osd bluestore_cache_autotune 0
ceph config set osd bluestore_cache_size_ssd 2G
ceph config set osd bluestore_cache_kv_ratio 0.2
ceph config set osd bluestore_cache_meta_ratio 0.8
ceph config set osd osd_min_pg_log_entries 10
ceph config set osd osd_max_pg_log_entries 10
ceph config set osd osd_pg_log_dups_tracked 10
ceph config set osd osd_pg_log_trim_min 10

So maybe I'll start resetting those to defaults (i.e. enabling cache
autotune etc.) as a first step..

> As a blind and brute force approach you might also want to compact RocksDB
> through ceph-kvstore-tool and switch bluestore allocator to bitmap
> (presuming default hybrid one is effective right now). Please do one
> modification at a time to realize what action is actually helpful if any.

will do.. thanks again for your hints

BR

nik

> Thanks,
>
> Igor
>
> On 5/2/2023 11:32 AM, Nikola Ciprich wrote:
> > Hello dear CEPH users and developers,
> >
> > we're dealing with strange problems.. we're having a 12 node Alma Linux 9
> > cluster, initially installed with CEPH 15.2.16, then upgraded to 17.2.5.
> > It's running a bunch of KVM virtual machines accessing volumes using RBD.
> >
> > everything is working well, but there is a strange and, for us, quite
> > serious issue - the speed of write operations (both sequential and
> > random) is constantly degrading drastically to almost unusable numbers
> > (in ~1 week it drops from ~70k 4k writes/s from 1 VM to ~7k writes/s).
> >
> > When I restart all OSD daemons, numbers immediately return to normal..
> >
> > volumes are stored on a replicated pool of 4 replicas, on top of
> > 7*12 = 84 INTEL SSDPE2KX080T8 NVMe drives.
> >
> > I've updated the cluster to 17.2.6 some time ago, but the problem
> > persists. This is especially annoying in connection with
> > https://tracker.ceph.com/issues/56896 as restarting OSDs is quite
> > painful when half of them crash..
> >
> > I don't see anything suspicious: node load is quite low, there are no
> > errors in the logs, and network latency and throughput are OK too.
> >
> > Anyone having a similar issue?
> >
> > I'd like to ask for hints on what I should check further..
> >
> > we're running lots of 14.2.x and 15.2.x clusters, none showing a
> > similar issue, so I suspect this is something related to Quincy.
> >
> > thanks a lot in advance
> >
> > with best regards
> >
> > nikola ciprich
>
> --
> Igor Fedotov
> Ceph Lead Developer
>
> Looking for help with your Ceph cluster? Contact us at https://croit.io
>
> croit GmbH, Freseniusstr. 31h, 81247 Munich
> CEO: Martin Verges - VAT-ID: DE310638492
> Com. register: Amtsgericht Munich HRB 231263
> Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx

--
-------------------------------------
Ing. Nikola CIPRICH
LinuxBox.cz, s.r.o.
28.rijna 168, 709 00 Ostrava

tel.:   +420 591 166 214
fax:    +420 596 621 273
mobil:  +420 777 093 799
www.linuxbox.cz

mobil servis: +420 737 238 656
email servis: servis@xxxxxxxxxxx
-------------------------------------
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx