Hello Igor,

On Tue, May 02, 2023 at 05:41:04PM +0300, Igor Fedotov wrote:
> Hi Nikola,
>
> I'd suggest to start monitoring perf counters for your osds,
> op_w_lat/subop_w_lat ones specifically. I presume they raise eventually,
> don't they?

OK, I've started collecting those for all OSDs. Current avgtime values are
around 0.0003 for subop_w_lat and 0.001-0.002 for op_w_lat. I guess it'll
need some time to show a trend, so I'll check tomorrow.

> Does subop_w_lat grow for every OSD or just a subset of them? How large is
> the delta between the best and the worst OSDs after a one week period? How
> many "bad" OSDs are at this point?

I'll see and report.

> And some more questions:
>
> How large are space utilization/fragmentation for your OSDs?

OSD usage is around 16-18%. Fragmentation should not be very bad; this
cluster has only been deployed for a few months.

> Is the same performance drop observed for artificial benchmarks, e.g. 4k
> random writes to a fresh RBD image using fio?

I will check again when the slowdown occurs and report.

> Is there any RAM utilization growth for OSD processes over time? Or may be
> any suspicious growth in mempool stats?

No, RAM usage seems to be pretty constant.
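In case it's useful to anyone following along: these counters come out of
`ceph daemon osd.N perf dump` as avgcount/sum pairs (real dumps also carry a
precomputed avgtime field). A minimal sketch of how I'm computing the
best/worst spread across OSDs -- the JSON fragments and numbers below are
made up for illustration, not from our cluster:

```python
import json

# Illustrative `ceph daemon osd.N perf dump` fragments (made-up numbers).
# Each latency counter is an avgcount/sum pair; avgtime = sum / avgcount.
raw = {
    0: '{"osd": {"op_w_latency": {"avgcount": 1000, "sum": 1.2}}}',
    1: '{"osd": {"op_w_latency": {"avgcount": 1000, "sum": 1.5}}}',
    2: '{"osd": {"op_w_latency": {"avgcount": 1000, "sum": 9.0}}}',
}

def avgtime(dump, counter="op_w_latency"):
    # Guard against a fresh counter with no samples yet.
    c = dump["osd"][counter]
    return c["sum"] / c["avgcount"] if c["avgcount"] else 0.0

lat = {n: avgtime(json.loads(d)) for n, d in raw.items()}
best, worst = min(lat, key=lat.get), max(lat, key=lat.get)
print(f"best osd.{best}: {lat[best]:.4f}s, worst osd.{worst}: {lat[worst]:.4f}s")
# prints: best osd.0: 0.0012s, worst osd.2: 0.0090s
```

Sampling this periodically and diffing the sums between samples would give
the per-interval latency rather than the lifetime average.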
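And, as a note to self, roughly what I plan to run -- the fio re-check and,
one at a time, the brute-force steps discussed in this thread. The osd id N,
data path, and pool/image names below are placeholders:

```shell
# fio baseline: 4k random writes to a fresh RBD image
# (pool/image/client names are placeholders)
fio --name=rbd-4k-randwrite --ioengine=rbd --clientname=admin \
    --pool=testpool --rbdname=testimg \
    --rw=randwrite --bs=4k --iodepth=32 --runtime=60 --time_based

# Offline RocksDB compaction -- the OSD must be stopped first:
systemctl stop ceph-osd@N
ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-N compact
systemctl start ceph-osd@N

# Switch the BlueStore allocator from the default hybrid to bitmap
# (takes effect on OSD restart) -- only after judging the compaction result,
# one modification at a time:
ceph config set osd bluestore_allocator bitmap
```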
However, it's probably worth noting that historically we're using the
following OSD options:

ceph config set osd bluestore_rocksdb_options compression=kNoCompression,max_write_buffer_number=32,min_write_buffer_number_to_merge=2,recycle_log_file_num=32,compaction_style=kCompactionStyleLevel,write_buffer_size=67108864,target_file_size_base=67108864,max_background_compactions=31,level0_file_num_compaction_trigger=8,level0_slowdown_writes_trigger=32,level0_stop_writes_trigger=64,max_bytes_for_level_base=536870912,compaction_threads=32,max_bytes_for_level_multiplier=8,flusher_threads=8,compaction_readahead_size=2MB
ceph config set osd bluestore_cache_autotune 0
ceph config set osd bluestore_cache_size_ssd 2G
ceph config set osd bluestore_cache_kv_ratio 0.2
ceph config set osd bluestore_cache_meta_ratio 0.8
ceph config set osd osd_min_pg_log_entries 10
ceph config set osd osd_max_pg_log_entries 10
ceph config set osd osd_pg_log_dups_tracked 10
ceph config set osd osd_pg_log_trim_min 10

So maybe I'll start resetting those to defaults (i.e. enabling cache
autotune etc.) as a first step..

> As a blind and brute force approach you might also want to compact RocksDB
> through ceph-kvstore-tool and switch bluestore allocator to bitmap
> (presuming default hybrid one is effective right now). Please do one
> modification at a time to realize what action is actually helpful if any.

will do.. thanks again for your hints

BR

nik

> Thanks,
>
> Igor
>
> On 5/2/2023 11:32 AM, Nikola Ciprich wrote:
> > Hello dear CEPH users and developers,
> >
> > we're dealing with strange problems.. we're having a 12 node Alma Linux 9
> > cluster, initially installed with CEPH 15.2.16, then upgraded to 17.2.5.
> > It's running a bunch of KVM virtual machines accessing volumes using RBD.
> >
> > everything is working well, but there is a strange and, for us, quite
> > serious issue - the speed of write operations (both sequential and
> > random) is constantly degrading drastically to almost unusable numbers
> > (in ~1 week it drops from ~70k 4k writes/s from 1 VM to ~7k writes/s).
> >
> > When I restart all OSD daemons, numbers immediately return to normal..
> >
> > volumes are stored on a replicated pool of 4 replicas, on top of
> > 7*12 = 84 INTEL SSDPE2KX080T8 NVMe drives.
> >
> > I've updated the cluster to 17.2.6 some time ago, but the problem
> > persists. This is especially annoying in connection with
> > https://tracker.ceph.com/issues/56896 as restarting OSDs is quite
> > painful when half of them crash..
> >
> > I don't see anything suspicious: node load is quite low, there are no
> > errors in the logs, and network latency and throughput are OK too.
> >
> > Anyone having a similar issue?
> >
> > I'd like to ask for hints on what I should check further..
> >
> > we're running lots of 14.2.x and 15.2.x clusters, none showing a
> > similar issue, so I suspect this is something related to Quincy.
> >
> > thanks a lot in advance
> >
> > with best regards
> >
> > nikola ciprich
>
> --
> Igor Fedotov
> Ceph Lead Developer
>
> Looking for help with your Ceph cluster? Contact us at https://croit.io
>
> croit GmbH, Freseniusstr. 31h, 81247 Munich
> CEO: Martin Verges - VAT-ID: DE310638492
> Com. register: Amtsgericht Munich HRB 231263
> Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx

--
-------------------------------------
Ing. Nikola CIPRICH
LinuxBox.cz, s.r.o.
28.rijna 168, 709 00 Ostrava

tel.:   +420 591 166 214
fax:    +420 596 621 273
mobil:  +420 777 093 799
www.linuxbox.cz

mobil servis: +420 737 238 656
email servis: servis@xxxxxxxxxxx
-------------------------------------
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx