Hi, I just checked and all OSDs have it set to true. It also does not
seem to be a problem with the snaptrim operation. We just had two
occasions in the last 7 days where nearly all OSDs logged these
messages very frequently (around 3k times in 20 minutes):

2022-09-12T20:27:19.146+0200 7f576de49700 -1 osd.9 786378
get_health_metrics reporting 1 slow ops, oldest is
osd_op(client.153241560.0:42288714 8.56
8:6a19e4ee:::rbd_data.4c64dc3662fb05.0000000000000c00:head
[write 2162688~4096 in=4096b] snapc 9835e=[]
ondisk+write+known_if_redirected e786375)

On Tue, 13 Sept 2022 at 20:20, Wesley Dillingham <wes@xxxxxxxxxxxxxxxxx> wrote:

> I haven't read through this entire thread, so forgive me if this was
> already mentioned:
>
> What is the parameter "bluefs_buffered_io" set to on your OSDs? We
> once saw a terrible slowdown on our OSDs during snaptrim events, and
> setting bluefs_buffered_io to true alleviated that issue. That was on
> a Nautilus cluster.
>
> Respectfully,
>
> Wes Dillingham
> wes@xxxxxxxxxxxxxxxxx
> LinkedIn <http://www.linkedin.com/in/wesleydillingham>
>
>
> On Tue, Sep 13, 2022 at 10:48 AM Boris Behrens <bb@xxxxxxxxx> wrote:
>
>> The cluster is SSD only with 2 TB, 4 TB and 8 TB disks. I would
>> expect this to be done fairly fast.
>> For now I will recreate every OSD in the cluster and check if this
>> helps.
>>
>> Do you experience slow ops (i.e. the cluster shows a message like
>> "cluster [WRN] Health check update: 679 slow ops, oldest one blocked
>> for 95 sec, daemons
>> [osd.0,osd.106,osd.107,osd.108,osd.113,osd.116,osd.123,osd.124,osd.125,osd.134]...
>> have slow ops. (SLOW_OPS)")?
>>
>> I can also see a huge spike in the load of all hosts in our cluster
>> for a couple of minutes.
>>
>>
>> On Tue, 13 Sept 2022 at 13:14, Frank Schilder <frans@xxxxxx> wrote:
>>
>> > Hi Boris.
>> >
>> > > 3. wait some time (took around 5-20 minutes)
>> >
>> > Sounds short. It might just have been the compaction that the OSDs
>> > do anyway on startup after the upgrade. I don't know how to check
>> > for a completed format conversion. What I see in your MON log is
>> > exactly what I have seen with default snap trim settings until all
>> > OSDs were converted. Once an OSD falls behind and slow ops start
>> > piling up, everything comes to a halt. Your logs clearly show a
>> > sudden drop of IOP/s on snap trim start, and I would guess this is
>> > the cause of the slowly growing ops backlog of the OSDs.
>> >
>> > If it's not that, I don't know what else to look for.
>> >
>> > Best regards,
>> > =================
>> > Frank Schilder
>> > AIT Risø Campus
>> > Bygning 109, rum S14
>> >
>> > ________________________________________
>> > From: Boris Behrens <bb@xxxxxxxxx>
>> > Sent: 13 September 2022 12:58:19
>> > To: Frank Schilder
>> > Cc: ceph-users@xxxxxxx
>> > Subject: Re: laggy OSDs and stalling krbd IO after upgrade from
>> > nautilus to octopus
>> >
>> > Hi Frank,
>> > we converted the OSDs directly on the upgrade:
>> >
>> > 1. install the new ceph version
>> > 2. restart all OSD daemons
>> > 3. wait some time (took around 5-20 minutes)
>> > 4. all OSDs were online again
>> >
>> > So I would expect that the OSDs are all upgraded correctly.
>> > I also checked when the trimming happens, and it does not seem to
>> > be an issue on its own, as the trim happens all the time in
>> > various sizes.
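
For reference, the settings and activity discussed above can be
inspected roughly like this; the throttle values at the end are only
examples, not recommendations:

    # configured default and the value each OSD actually uses
    ceph config get osd bluefs_buffered_io
    ceph tell 'osd.*' config get bluefs_buffered_io

    # PGs currently trimming or queued for trimming
    ceph pg ls snaptrim
    ceph pg ls snaptrim_wait

    # slow down snap trimming if it overwhelms the OSDs (example values)
    ceph config set osd osd_snap_trim_sleep 1
    ceph config set osd osd_pg_max_concurrent_snap_trims 1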
>> > On Tue, 13 Sept 2022 at 12:45, Frank Schilder <frans@xxxxxx> wrote:
>> >
>> > Are you observing this here:
>> > https://lists.ceph.io/hyperkitty/list/ceph-users@xxxxxxx/message/LAN6PTZ2NHF2ZHAYXZIQPHZ4CMJKMI5K/
>> >
>> > =================
>> > Frank Schilder
>> > AIT Risø Campus
>> > Bygning 109, rum S14
>> >
>> > ________________________________________
>> > From: Boris Behrens <bb@xxxxxxxxx>
>> > Sent: 13 September 2022 11:43:20
>> > To: ceph-users@xxxxxxx
>> > Subject: laggy OSDs and stalling krbd IO after upgrade from
>> > nautilus to octopus
>> >
>> > Hi, I really need your help.
>> >
>> > We are currently experiencing very bad cluster hangups that happen
>> > sporadically (once on 2022-09-08 at midday, 48 hrs after the
>> > upgrade, and once on 2022-09-12 in the evening).
>> > We use krbd without cephx for the qemu clients, and when the OSDs
>> > are getting laggy, the krbd connection comes to a grinding halt,
>> > to the point that all IO is stalling and we can't even unmap the
>> > rbd device.
>> >
>> > From the logs, it looks like the cluster starts to snaptrim a lot
>> > of PGs, then PGs become laggy, and then the cluster snowballs into
>> > laggy OSDs.
>> > I have attached the monitor log and the osd log (from one OSD)
>> > around the time where it happened.
>> >
>> > - is this a known issue?
>> > - what can I do to debug it further?
>> > - can I downgrade back to nautilus?
>> > - should I increase the PG count for the pool to 4096 or 8192?
>> >
>> > The cluster contains a mixture of 2, 4 and 8 TB SSDs (no rotating
>> > disks), where the 8 TB disks got ~120 PGs and the 2 TB disks got
>> > ~30 PGs. All hosts have a minimum of 128 GB RAM, and the kernel
>> > logs of all ceph hosts do not show anything for the timeframe.
>> >
>> > Cluster stats:
>> >   cluster:
>> >     id:     74313356-3b3d-43f3-bce6-9fb0e4591097
>> >     health: HEALTH_OK
>> >
>> >   services:
>> >     mon: 3 daemons, quorum ceph-rbd-mon4,ceph-rbd-mon5,ceph-rbd-mon6 (age 25h)
>> >     mgr: ceph-rbd-mon5(active, since 4d), standbys: ceph-rbd-mon4, ceph-rbd-mon6
>> >     osd: 149 osds: 149 up (since 6d), 149 in (since 7w)
>> >
>> >   data:
>> >     pools:   4 pools, 2241 pgs
>> >     objects: 25.43M objects, 82 TiB
>> >     usage:   231 TiB used, 187 TiB / 417 TiB avail
>> >     pgs:     2241 active+clean
>> >
>> >   io:
>> >     client: 211 MiB/s rd, 273 MiB/s wr, 1.43k op/s rd, 8.80k op/s wr
>> >
>> > --- RAW STORAGE ---
>> > CLASS  SIZE     AVAIL    USED     RAW USED  %RAW USED
>> > ssd    417 TiB  187 TiB  230 TiB   231 TiB      55.30
>> > TOTAL  417 TiB  187 TiB  230 TiB   231 TiB      55.30
>> >
>> > --- POOLS ---
>> > POOL                   ID  PGS   STORED   OBJECTS  USED     %USED  MAX AVAIL
>> > isos                    7    64  455 GiB  117.92k  1.3 TiB   1.17     38 TiB
>> > rbd                     8  2048   76 TiB   24.65M  222 TiB  66.31     38 TiB
>> > archive                 9   128  2.4 TiB  669.59k  7.3 TiB   6.06     38 TiB
>> > device_health_metrics  10     1   25 MiB      149   76 MiB      0     38 TiB
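
Regarding the SLOW_OPS warnings and the pg_num question above, a rough
sketch of how to dig further (osd.9 and 4096 are only example values;
the daemon commands have to be run on the host that carries the OSD):

    # which daemons currently report slow ops
    ceph health detail

    # inspect the stuck ops on one of the affected OSDs
    ceph daemon osd.9 dump_ops_in_flight
    ceph daemon osd.9 dump_blocked_ops

    # PG count and utilisation per OSD, to judge whether a split helps
    ceph osd df tree

    # if you decide to split, pgp_num follows automatically on
    # Nautilus and newer
    ceph osd pool set rbd pg_num 4096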
--
The self-help group "UTF-8 problems" will meet in the big hall this
time, as an exception.
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx