Setting bluefs_buffered_io=true via a restart of (all) OSDs didn't change anything. But I made another observation: once a week a large number of objects (and a lot of space) is reclaimed because of fstrim running inside the VMs. After this the latency is fine for about 12 hours or so and then gradually gets worse. Here is the visualisation (it shows an order-of-magnitude latency drop for kv_final_lat):

https://drive.google.com/file/d/1j4S4KXyZigRGKX-kng9KDU2QNv9qN3YF/view?usp=sharing
https://drive.google.com/file/d/1RTnQp8qeqiF04hBBjAZ5tw07_KBPElnq/view?usp=sharing

Before the upgrade to 14.2.16 this did not happen:

https://drive.google.com/file/d/1cUm2SaQ7XBmLwDPnUM4esrx-sbbnO-hi/view?usp=sharing

My first impulse (maybe the new version has higher cache/space requirements) was to raise "osd memory target" for one OSD to see if this has an effect, but it didn't. IMHO the increased latencies point to RocksDB. Maybe the default settings for the RocksDB caches do not fit anymore? Is anyone aware of changes between 14.2.8 and 14.2.16 regarding RocksDB that could lead to this behaviour?

Thanks for reading and support,
Björn

> On 13.02.2021 at 09:42, Frank Schilder <frans@xxxxxx> wrote:
>
> For comparison, a memory usage graph of a freshly deployed host with buffered_io=true: https://imgur.com/a/KUC2pio . Note the very rapid increase of buffer usage.
>
> OK, so you are using a self-made dashboard definition. I was hoping that people had published something; I try to avoid starting from scratch.
>
> Best regards and good luck,
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> ________________________________________
> From: Björn Dolkemeier <b.dolkemeier@xxxxxxx>
> Sent: 13 February 2021 09:33:12
> To: Frank Schilder
> Cc: ceph-users@xxxxxxx
> Subject: Re: Latency increase after upgrade 14.2.8 to 14.2.16
>
> I will definitely follow your steps and apply bluefs_buffered_io=true via ceph.conf and a restart. My first try was to update the setting dynamically. I'll report when it's done.
>
> We monitor our clusters via Telegraf (Ceph input plugin), InfluxDB and a custom Grafana dashboard fitted to our needs.
>
> Björn
>
>> On 13.02.2021 at 09:23, Frank Schilder <frans@xxxxxx> wrote:
>>
>> Ahh, OK. I'm not sure it has that effect. What people observed was that RocksDB access became faster due to system buffer cache hits. This has an indirect influence on data access latency.
>>
>> The typical case is "high IOPs on the WAL/DB device after upgrade", and setting bluefs_buffered_io=true got this back to normal, also improving client performance as a result.
>>
>> Your latency graphs actually look suspiciously like it should work for you. Are you sure the OSD is using the value? I had problems setting some parameters; I needed to include them in the ceph.conf file and restart to force them through.
>>
>> A sign that bluefs_buffered_io=true is applied is rapidly increasing system buffer usage as reported by top or free. If the values reported are similar on all hosts, bluefs_buffered_io is still disabled.
>>
>> If I may ask, what framework are you using to pull these graphs? Is there a Grafana dashboard one can download somewhere, or is it something you implemented yourself? I plan to enable prometheus on our cluster, but don't know about a good data sink providing a pre-defined dashboard.
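
(Side note, in case it helps anyone following the thread: to double-check whether a running OSD actually picked up the value, something along these lines should work via the admin socket. This is only a sketch; osd.0 is a placeholder id, and the commands have to be run on the host carrying that OSD.)

    # Ask the running OSD which value it is actually using:
    ceph daemon osd.0 config get bluefs_buffered_io

    # List everything that deviates from the built-in defaults,
    # to spot settings that never got applied:
    ceph daemon osd.0 config diff

    # If the flag is active, the host's buffer/page cache usage should
    # climb quickly after an OSD restart:
    free -h    # watch the buff/cache column over a few minutes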
>>
>> Best regards,
>> =================
>> Frank Schilder
>> AIT Risø Campus
>> Bygning 109, rum S14
>>
>> ________________________________________
>> From: Björn Dolkemeier <b.dolkemeier@xxxxxxx>
>> Sent: 13 February 2021 08:51:11
>> To: Frank Schilder
>> Cc: ceph-users@xxxxxxx
>> Subject: Re: Latency increase after upgrade 14.2.8 to 14.2.16
>>
>> Thanks for the quick reply, Frank.
>>
>> Sorry, the graphs/attachment were filtered. Here is an example of one latency: https://drive.google.com/file/d/1qSWmSmZ6JXVweepcoY13ofhfWXrBi2uZ/view?usp=sharing
>>
>> I'm aware that the overall performance depends on the slowest OSD.
>>
>> What I expect is that bluefs_buffered_io=true set on one OSD is reflected in dropped latencies for that particular OSD.
>>
>> Best regards,
>> Björn
>>
>> On 13.02.2021 at 07:39, Frank Schilder <frans@xxxxxx> wrote:
>>
>> The graphs were forgotten or filtered out.
>>
>> Changing the buffered_io value on one host will not change client IO performance, as it's always the slowest OSD that's decisive. However, it should have an effect on the IOP/s load reported by iostat for the disks on that host.
>>
>> Does setting bluefs_buffered_io=true on all hosts have an effect on client IO? Note that it might need a restart even if the documentation says otherwise.
>>
>> Best regards,
>> =================
>> Frank Schilder
>> AIT Risø Campus
>> Bygning 109, rum S14
>>
>> ________________________________________
>> From: Björn Dolkemeier <b.dolkemeier@xxxxxxx>
>> Sent: 13 February 2021 07:16:06
>> To: ceph-users@xxxxxxx
>> Subject: Latency increase after upgrade 14.2.8 to 14.2.16
>>
>> Hi,
>>
>> after upgrading Ceph from 14.2.8 to 14.2.16 we experienced increased latencies. There were no changes in hardware, configuration, workload or networking, just a rolling update via ceph-ansible on a running production cluster. The cluster consists of 16 OSDs (all SSD) across 4 nodes. The VMs served via RBD from this cluster currently suffer from high I/O-wait CPU.
>>
>> These are some latencies that are increased after the update:
>> - op_r_latency
>> - op_w_latency
>> - kv_final_lat
>> - state_kv_commiting_lat
>> - submit_lat
>> - subop_w_latency
>>
>> Do these latencies point to KV/RocksDB?
>>
>> These are some latencies which are NOT increased after the update:
>> - kv_sync_lat
>> - kv_flush_lat
>> - kv_commit_lat
>>
>> I attached one graph showing the massive increase after the update.
>>
>> I tried setting bluefs_buffered_io=true (as its default value was changed and it was mentioned as performance-relevant) for all OSDs on one host, but this does not make a difference.
>>
>> The ceph.conf is fairly simple:
>>
>> [global]
>> cluster network = xxx
>> fsid = xxx
>> mon host = xxx
>> public network = xxx
>>
>> [osd]
>> osd memory target = 10141014425
>>
>> Any ideas what to try? Help appreciated.
>>
>> Björn
>>
>> --
>>
>> dbap GmbH
>> phone +49 251 609979-0 / fax +49 251 609979-99
>> Heinr.-von-Kleist-Str. 47, 48161 Muenster, Germany
>> http://www.dbap.de
>>
>> dbap GmbH, Sitz: Muenster
>> HRB 5891, Amtsgericht Muenster
>> Geschaeftsfuehrer: Bjoern Dolkemeier
>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
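
P.S., for anyone who wants to reproduce the comparison of the counters named in this thread on their own cluster: they are OSD perf counters that can be read directly from the admin socket, roughly as sketched below. osd.0 is a placeholder id, jq is assumed to be installed, and the exact section names in the perf dump output can vary slightly between releases.

    # Dump all perf counters of one OSD via its admin socket (run on the OSD's host):
    ceph daemon osd.0 perf dump > perf_dump.json

    # The latencies discussed above sit in the "osd" and "bluestore" sections, e.g.:
    ceph daemon osd.0 perf dump | jq '{op_w_latency: .osd.op_w_latency, kv_final_lat: .bluestore.kv_final_lat}'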