Re: Snaptrim issue after nautilus to octopus upgrade

Hi Özkan,

FWIW, we observed something similar after upgrading from Mimic => Nautilus => Octopus and then starting to trim snapshots.

The size of our cluster was a bit smaller, but the effect was the same: when snapshot trimming started, the OSDs went into high load and RBD I/O became extremely slow.
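In case it helps to gauge the impact: something like this (just a sketch using the standard CLI; the awk column is a guess at the pgs_brief layout on your version) should show which PGs are stuck in trimming:

 # list PGs currently in snaptrim or snaptrim_wait
 ceph pg dump pgs_brief 2>/dev/null | awk '$2 ~ /snaptrim/ {print $1, $2}'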

We first tried:
 ceph tell osd.* injectargs '--osd-snap-trim-sleep 10'
which helped, but of course snapshots kept piling up.
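If you want the sleep to survive OSD restarts, the centralized config database should also work (a sketch; we only used injectargs ourselves):

 # persist the throttle instead of injecting it at runtime
 ceph config set osd osd_snap_trim_sleep 10
 # remove the override again later
 ceph config rm osd osd_snap_trim_sleep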

Next, we performed plain RocksDB compactions via:

 for A in {0..5}; do ceph tell osd.$A compact | sed 's/^/'$A': /' & done

for batches of OSDs at a time, and their load dropped considerably. Finally, after we'd churned through all OSDs, I/O load was low again and we could go back to the default:
 ceph tell osd.* injectargs '--osd-snap-trim-sleep 0'
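In case someone wants to run this across the whole cluster instead of hard-coded ID ranges, a rough generalization of the loop above (the batch size of 6 is an arbitrary choice, adjust to what your cluster tolerates):

 # compact all OSDs in batches of 6, waiting for each batch to finish
 N=0
 for OSD in $(ceph osd ls); do
     ceph tell osd.$OSD compact | sed "s/^/$OSD: /" &
     (( ++N % 6 == 0 )) && wait
 done
 wait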

After this, the situation has stabilized for us. So my guess would be that the RocksDBs grew too much after the OMAP format conversion and the compaction shrank them again.
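To check whether omap bloat is indeed the culprit, comparing the OMAP column of the OSD utilization dump before and after compaction is a simple test (the column exists since Nautilus, if I remember correctly):

 # per-OSD OMAP usage; values should shrink noticeably after compaction
 ceph osd df tree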

Maybe that also helps in your case?
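One more thing, in case single OSDs are too overwhelmed for an online compact: an offline compaction with the daemon stopped is another option (a sketch; OSD ID 3 and the default data path are placeholders):

 # stop the OSD, compact its RocksDB offline, then start it again
 systemctl stop ceph-osd@3
 ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-3 compact
 systemctl start ceph-osd@3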

Interestingly, we did not observe this on other clusters (one mainly for CephFS, another one with mirrored RBD volumes), which took the same upgrade path.

Cheers,
	Oliver

On 23.08.24 at 16:46, Özkan Göksu wrote:
Hello folks.

We have a Ceph cluster with 2000+ RBD images across 20 nodes.

We upgraded the cluster from 14.2.16 (Nautilus) to 15.2.14 (Octopus), and
after the upgrade we started to see snap trim issues.
Without the "nosnaptrim" flag set, the system is not usable right now.
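For reference, the flag is toggled cluster-wide like this:

 # pause snapshot trimming on all OSDs
 ceph osd set nosnaptrim
 # resume trimming once the cluster can cope again
 ceph osd unset nosnaptrim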

I think the problem is caused by the omap format conversion during the Octopus upgrade. The Octopus release notes say:

Note that the first time each OSD starts, it will do a format conversion to
improve the accounting for “omap” data. This may take a few minutes to as
much as a few hours (for an HDD with lots of omap data). You can disable
this automatic conversion with:

 ceph config set osd bluestore_fsck_quick_fix_on_mount false

What should I do to solve this problem?

Thanks.

--
Oliver Freyermuth
Universität Bonn
Physikalisches Institut, Raum 1.047
Nußallee 12
53115 Bonn
--
Tel.: +49 228 73 2367
Fax:  +49 228 73 7869
--


_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
