Hi,
one of our customers recently tried that in a non-production
cluster (Nautilus) and it was horrible. They wanted to test a couple
of failure-resilience scenarios and needed to purge old data (I
believe around 200 TB or so); we ended up recreating the OSDs instead
of waiting for the deletion to finish. They have HDDs with separate
RocksDB on SSDs (ratio 1:5), and the SSDs were completely saturated
according to iostat.
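If you want to check the same thing on your own cluster during such a
removal, something like this should be enough (the device names are
just placeholders for your RocksDB/WAL SSDs):

# watch the DB/WAL devices while the deletion is running; %util close
# to 100 and growing w_await mean they are the bottleneck
iostat -x 5 /dev/sdc /dev/sdd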
It's a known issue (see [1], [2]); I'll just quote Igor:
there are two issues with massive data removals
1) It might take a pretty long time to complete since it performs 15
object removals per second per PG/OSD. Hence it might take days.
2) RocksDB starts to perform poorly after (or even during) such
removals.
As far as I can see, the backports for the fix are pending for Mimic,
Nautilus and Octopus.
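Until the backports are available, what should at least help to
recover from the RocksDB degradation is a manual compaction (that's
general RocksDB behaviour with tombstones, not something from the
trackers), and the removal rate can be traded against load via
osd_delete_sleep. The OSD id, path and value below are only examples:

# online compaction of one OSD's RocksDB (repeat per OSD)
ceph tell osd.0 compact

# or offline, with the OSD stopped (path is an example)
ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-0 compact

# the PG removal rate is throttled by osd_delete_sleep (and its
# _hdd/_ssd variants where available); lowering it speeds up the
# deletion at the cost of more load on the DB devices
ceph config set osd osd_delete_sleep 1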
Regards,
Eugen
[1] https://tracker.ceph.com/issues/45765
[2] https://tracker.ceph.com/issues/47044
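And for completeness, since the question is about removing a whole
pool: the deletion itself is a single monitor command (it has to be
enabled explicitly); everything above is about the asynchronous
cleanup the OSDs do afterwards. The pool name is just the one from
your PS:

# allow pool deletion on the monitors (can be reverted afterwards)
ceph config set mon mon_allow_pool_delete true

# delete the pool; the actual data removal then happens
# asynchronously on the OSDs
ceph osd pool delete ch-zh1-az1.rgw.buckets.data \
    ch-zh1-az1.rgw.buckets.data --yes-i-really-really-mean-it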
Quoting Scheurer François <francois.scheurer@xxxxxxxxxxxx>:
Hi everybody
Does anybody have experience with significant performance degradation
during a pool deletion?
We are asking because we are going to delete a 370 TiB pool with
120 M objects and have never done this before.
The pool is using erasure coding 8+2 on NVMe SSDs, with RocksDB/WAL
on NVMe Optane disks.
OpenStack VMs are running on the other RBD pools.
Thank you in advance for your feedback!
Cheers
Francois
PS:
this is not an option, as it would take about 100 years to complete ;-) :
rados -p ch-zh1-az1.rgw.buckets.data ls | while read i; do rados -p
ch-zh1-az1.rgw.buckets.data rm "$i"; done
--
EveryWare AG
François Scheurer
Senior Systems Engineer
Zurlindenstrasse 52a
CH-8003 Zürich
tel: +41 44 466 60 00
fax: +41 44 466 60 10
mail: francois.scheurer@xxxxxxxxxxxx
web: http://www.everyware.ch