Hi,

Last Saturday I upgraded my production cluster from dumpling to emperor (we had been running it successfully on a test cluster). A couple of hours later, OSDs started failing: some were marked down by Ceph, probably because of IO starvation. I set the cluster to «noout», restarted the down OSDs, then let the cluster recover. 24 hours later, same problem (at nearly the same hour). So I chose to upgrade directly to firefly, which is still maintained. Things are better, but the cluster is slower than it was with dumpling.

The main problem seems to be that the OSDs now do twice as many write operations per second:
https://daevel.fr/img/firefly/firefly-upgrade-OSD70-IO.png
https://daevel.fr/img/firefly/firefly-upgrade-OSD71-IO.png

But journal activity hasn't changed (an SSD dedicated to OSD 70+71+72):
https://daevel.fr/img/firefly/firefly-upgrade-OSD70+71-journal.png

Nor has node bandwidth:
https://daevel.fr/img/firefly/firefly-upgrade-dragan-bandwidth.png

Or overall cluster IO activity:
https://daevel.fr/img/firefly/firefly-upgrade-cluster-IO.png

Some background: the cluster is split into pools on «full SSD» OSDs and pools on «HDD + SSD journal» OSDs. Only the «HDD+SSD» OSDs seem to be affected. Each «HDD+SSD» node has 9 OSDs (9 HDDs with 3 journal SSDs), and there are only 3 such nodes, so 27 «HDD+SSD» OSDs in total.

The IO peak between 03:00 and 09:00 corresponds to snapshot rotation (i.e. «rbd snap rm» operations). osd_snap_trim_sleep has been set to 0.8 for months. Yesterday I tried reducing osd_pg_max_concurrent_snap_trims to 1, but it doesn't seem to really help. The only thing that does seem to help is reducing osd_disk_threads from 8 to 1.

So, any idea what's happening?

Thanks for any help,
Olivier
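
PS: for reference, here is a rough sketch of the relevant part of my ceph.conf after these changes (the option names are the ones mentioned above; the rest of my real file is omitted):

    [osd]
        osd snap trim sleep = 0.8
        osd pg max concurrent snap trims = 1
        osd disk threads = 1

And the flag I set before restarting the down OSDs, so the cluster wouldn't start rebalancing in the meantime:

    ceph osd set noout
    # restart the affected OSDs, wait for recovery, then:
    ceph osd unset noout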