Here's some more insight into the issue. It looks like the load is triggered by a snaptrim operation. We have a backup pool that serves as OpenStack cinder-backup storage, performing snapshot backups every night. Old backups are also deleted every night, so snaptrim is initiated. This snaptrim increased the load on the block.db devices after the upgrade, and it kills one SSD's performance in particular: it serves as the block.db/wal device for one of the fatter backup-pool OSDs, which has more PGs placed there. This is a Kingston SSD, and we see the same issue on other Kingston SSD journals too; Intel SSD journals are not affected as badly, though they also see increased load. In any case, there are now a lot of read IOPS on the block.db devices after the upgrade that were not there before (a quick sampler sketch is at the end of this mail). I wonder how 600 IOPS can destroy an SSD's performance that hard.

Tue, 4 Aug 2020 at 12:54, Vladimir Prokofev <v@xxxxxxxxxxx>:

> Good day, cephers!
>
> We've recently upgraded our cluster from 14.2.8 to 14.2.10, also
> performing a full system package upgrade (Ubuntu 18.04 LTS).
> After that, performance dropped significantly, the main reason being that
> journal SSDs now have no merges, huge queues, and increased latency.
> There are a few screenshots in the attachments. This is for an SSD journal
> that supports block.db/block.wal for 3 spinning OSDs, and it looks like
> this for all our SSD block.db/wal devices across all nodes.
> Any ideas what may cause that? Maybe I've missed something important in
> the release notes?
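
In case anyone wants to reproduce the read-IOPS numbers: a quick Python sketch over /proc/diskstats is enough to watch read IOPS and the read-merge rate on a block.db device, roughly the r/s and rrqm/s columns of iostat -x. The device name "sdc" and the 5-second interval below are placeholders, not our actual setup; adjust them for your own journal devices.

#!/usr/bin/env python3
"""Rough sampler for read IOPS and read-merge rate on one block device,
based on /proc/diskstats. Stop it with Ctrl-C."""

import sys
import time

DEVICE = sys.argv[1] if len(sys.argv) > 1 else "sdc"  # placeholder block.db/wal SSD
INTERVAL = 5  # seconds between samples


def read_counters(device):
    """Return (reads_completed, reads_merged) for the device from /proc/diskstats."""
    with open("/proc/diskstats") as f:
        for line in f:
            fields = line.split()
            # fields: major minor name reads_completed reads_merged sectors_read ...
            if fields[2] == device:
                return int(fields[3]), int(fields[4])
    raise SystemExit(f"device {device!r} not found in /proc/diskstats")


prev_reads, prev_merged = read_counters(DEVICE)
while True:
    time.sleep(INTERVAL)
    reads, merged = read_counters(DEVICE)
    print(f"{DEVICE}: {(reads - prev_reads) / INTERVAL:8.1f} read IOPS, "
          f"{(merged - prev_merged) / INTERVAL:8.1f} read merges/s")
    prev_reads, prev_merged = reads, merged

Run it with the block.db device name as the argument; /proc/diskstats is world-readable, so no root is needed. Queue depth and latency are easier to read straight from iostat -x, which is where the screenshots in the quoted mail come from.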