When the thread below was first published, my team tried to reproduce the results and couldn't. A couple of factors likely contribute to the differing behavior:

* _Micron 5100_, for example, isn't a single model: the 5100 _Eco_, _Pro_, and _Max_ are different beasts. Similarly, implementation and firmware details vary by drive _size_ as well. The moral of the story is to be careful extrapolating an experience with one specific drive to other models that one might assume are equivalent but aren't.
* For SAS/SATA drives the HBA in use may be a significant factor as well.

Sketches of the commands discussed in the thread follow after the quoted messages.

> SSDs are not equal to high performance: https://yourcmc.ru/wiki/index.php?title=Ceph_performance&mobileaction=toggle_view_desktop
>
> Depending on your model, performance can be very poor.
>
> Best regards,
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> ________________________________________
> From: Pascal Ehlert <pascal@xxxxxxxxxxxx>
> Sent: 10 January 2021 19:19:09
> To: Frank Schilder
> Cc: ceph-users@xxxxxxx
> Subject: Re: Re: Snaptrim making cluster unusable
>
> I made the suggested changes.
>
> (Un)fortunately I am not able to reproduce the issue anymore. Neither with the original settings nor the updated setting.
> This may be due to the fact that the problematic snapshots have been removed/trimmed now.
> When I make new snapshots of the same volumes, they are (obviously) trimmed in a few seconds without an impact on performance.
>
> I will try to reproduce this again by artificially boosting the snapshot size.
>
> For now, would you mind explaining if and why disabling the write cache is a good idea in general?
> It feels that having too many layers of cache can be detrimental and I'd leave it disable then.
>
> Thank you very much!
>
> Pascal
>
>
> Frank Schilder wrote on 10.01.21 18:56:
>>>> - do you have bluefs_buffered_io set to true
>>> No
>> Try setting it to true.
>>
>>> Is there anything specific I can do to check the write cache configuration?
>> Yes, "smartctl -g wcache DEVICE" will tell you if writeback cache is disabled. If not, use "smartctl -s wcache=off DEVICE" to disable it. Note that this setting does not persist reboot. You will find a discussion about how to do that in the list.
>>
>> With both changes, try to enable snaptrim again and report back.
>>
>> Best regards,
>> =================
>> Frank Schilder
>> AIT Risø Campus
>> Bygning 109, rum S14
>>
>> ________________________________________
>> From: Pascal Ehlert <pascal@xxxxxxxxxxxx>
>> Sent: 10 January 2021 18:47:40
>> To: Frank Schilder
>> Cc: ceph-users@xxxxxxx
>> Subject: Re: Snaptrim making cluster unusable
>>
>> Hi Frank,
>>
>> Thanks for getting back!
>>> - ceph version
>> 15.2.6 (now upgraded to 15.2.8 and I was able to reproduce the issue)
>>> - rbd image config (meta- and data pool the same/different?)
>> We are not using EC but regular replicated pools, so I assume meta and data pool are the same?
>>> - how many PGs do the affected pools have
>> 512 for a total of 20.95TB of data
>>> - how many PGs per OSD (as stated by ceph osd df tree)
>> Varying between ~80 to ~220 with the 4TB disks having roughly twice as many as the 2TB disks
>>> - what type of SSDs, do they have power loss protection, is write cache disabled
>> Mixed Intel SSDs, one example being Intel® SSD D3-S4510 Series
>> If this becomes relevant, I can look up some of the exact model, but I couldn't pinpoint specific OSDs that struggled
>>
>> The disks are connected through standard Intel SATA controllers, sometimes onboard.
>> Is there anything specific I can do to check the write cache configuration?
>>
>>> - do you have bluefs_buffered_io set to true
>> No
>>
>> Regards,
>>
>> Pascal
>>
>>> For comparison, we are running daily rolling snapshots on ca. 260 RBD images with separate replicated meta-data and 6+2 EC data pool without any issues. No parameters changed from default. Version is mimic-13.2.10.
>>>
>>> =================
>>> Frank Schilder
>>> AIT Risø Campus
>>> Bygning 109, rum S14
>>>
>>> ________________________________________
>>> From: Pascal Ehlert <pascal@xxxxxxxxxxxx>
>>> Sent: 10 January 2021 18:06:18
>>> To: ceph-users@xxxxxxx
>>> Subject: Snaptrim making cluster unusable
>>>
>>> Hi all,
>>>
>>> We are running a small cluster with three nodes and 6-8 OSDs each.
>>> The OSDs are SSDs with sizes from 2 to 4 TB. Crush map is configured so all data is replicated to each node.
>>> The Ceph version is Ceph 15.2.6.
>>>
>>> Today I deleted 4 Snapshots of the same two 400GB and 500GB rbd volumes.
>>> Shortly after issuing the delete, I noticed the cluster became unresponsive to an extend where almost all our services went down due high IO latency.
>>>
>>> After a while, I noticed about 20 active snaptrim tasks + another 200 or so snaptrim_wait.
>>>
>>> I tried setting
>>> osd_snap_trim_sleep to 3,
>>> osd_pg_max_concurrent_snap_trims to 1
>>> rbd_balance_snap_reads to true,
>>> rbd_localize_snap_reads to true
>>>
>>> Still the only way to make the cluster responsive again was to set osd_pg_max_concurrent_snap_trims to 0 and thus disable snaptrimming.
>>> I tried a few other options, but whenever snaptrims are running for a significant number of PGs, the cluster becomes completely unusable.
>>>
>>> Are there any other options to throttle snaptrimming for that I haven't tried, yet?
>>>
>>> Thank you,
>>>
>>> Pascal
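
For reference, here is how the throttles discussed in the thread might be applied cluster-wide. This is a minimal sketch assuming an Octopus-era cluster managed through the centralized config database (`ceph config set`); the values are the ones tried in the thread, not tuned recommendations.

```sh
# Sketch: snaptrim-related settings discussed in the thread.
# Values are taken from the thread, not recommendations; adjust to taste.

# Sleep between snaptrim operations and limit concurrent trims per OSD.
ceph config set osd osd_snap_trim_sleep 3
ceph config set osd osd_pg_max_concurrent_snap_trims 1

# Frank's other suggestion: buffered reads for BlueFS/RocksDB.
# Depending on the release this may only take effect after an OSD restart.
ceph config set osd bluefs_buffered_io true

# Emergency brake used in the thread: stop snaptrim entirely until resolved.
# ceph config set osd osd_pg_max_concurrent_snap_trims 0

# Push a value to running OSDs immediately, if needed:
# ceph tell 'osd.*' injectargs '--osd_snap_trim_sleep 3'
```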
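
And a similar sketch for the write-cache check Frank describes, assuming smartmontools is installed (hdparm/sdparm are shown as transport-specific alternatives). Note that recent smartctl man pages document the set syntax as `wcache,off` with a comma rather than the `wcache=off` form in the mail; check `man smartctl` for your version. The change is volatile and is lost after a power cycle.

```sh
#!/usr/bin/env bash
# Sketch: inspect and disable the volatile write cache on a SATA/SAS drive.
# Assumes smartmontools is installed; hdparm/sdparm lines are alternatives.
set -euo pipefail

DEV="${1:-/dev/sdX}"   # pass the OSD's data device as the first argument

# Show whether the drive's write cache is currently enabled.
smartctl -g wcache "$DEV"

# Disable the volatile write cache (lost again after a power cycle/reboot).
smartctl -s wcache,off "$DEV"

# Transport-specific alternatives:
# hdparm -W0 "$DEV"            # SATA
# sdparm --set=WCE=0 "$DEV"    # SAS

# Persistence across reboots is usually handled with a udev rule or a
# boot-time script; see the list discussion referenced in the thread.
```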