Hi Pascal,

Can you add a bit more information:

- ceph version
- rbd image config (meta- and data pool the same/different?)
- how many PGs do the affected pools have
- how many PGs per OSD (as stated by ceph osd df tree)
- what type of SSDs, do they have power loss protection, is the write cache disabled
- do you have bluefs_buffered_io set to true

For comparison, we are running daily rolling snapshots on ca. 260 RBD images with a separate replicated meta-data pool and a 6+2 EC data pool without any issues. No parameters changed from default. Version is mimic-13.2.10.

=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Pascal Ehlert <pascal@xxxxxxxxxxxx>
Sent: 10 January 2021 18:06:18
To: ceph-users@xxxxxxx
Subject: Snaptrim making cluster unusable

Hi all,

We are running a small cluster with three nodes and 6-8 OSDs each. The OSDs are SSDs with sizes from 2 to 4 TB. The CRUSH map is configured so that all data is replicated to each node. The Ceph version is 15.2.6.

Today I deleted four snapshots of the same two 400 GB and 500 GB RBD volumes. Shortly after issuing the delete, I noticed that the cluster became unresponsive to an extent where almost all our services went down due to high IO latency.

After a while, I noticed about 20 active snaptrim tasks plus another 200 or so in snaptrim_wait. I tried setting:

  osd_snap_trim_sleep to 3
  osd_pg_max_concurrent_snap_trims to 1
  rbd_balance_snap_reads to true
  rbd_localize_snap_reads to true

Still, the only way to make the cluster responsive again was to set osd_pg_max_concurrent_snap_trims to 0 and thus disable snaptrimming entirely. I tried a few other options, but whenever snaptrims are running for a significant number of PGs, the cluster becomes completely unusable.

Are there any other options to throttle snaptrimming that I haven't tried yet?

Thank you,
Pascal
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
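
A minimal sketch, assuming an Octopus-era cluster with admin access and centralized config, of commands that would collect the information Frank asks for and apply the throttles Pascal mentions (<pool>/<image> below are placeholders for the affected RBD image):

  # Cluster and pool overview: versions, PGs per pool, PGs per OSD
  ceph versions
  ceph osd pool ls detail
  ceph osd df tree

  # RBD image layout: shows a separate data_pool if one is configured
  rbd info <pool>/<image>

  # Current values of the settings discussed in the thread
  ceph config get osd bluefs_buffered_io
  ceph config get osd osd_snap_trim_sleep
  ceph config get osd osd_pg_max_concurrent_snap_trims

  # Throttle snaptrim at runtime; larger sleep and fewer concurrent trims
  # reduce the IO impact, at the cost of slower snapshot space reclamation
  ceph config set osd osd_snap_trim_sleep 3
  ceph config set osd osd_pg_max_concurrent_snap_trims 1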