Hi,

maybe this is related?:

http://tracker.ceph.com/issues/9503
"Dumpling: removing many snapshots in a short time makes OSDs go berserk"

http://tracker.ceph.com/issues/9487
"dumpling: snaptrimmer causes slow requests while backfilling. osd_snap_trim_sleep not helping"

http://lists.opennebula.org/pipermail/ceph-users-ceph.com/2014-December/045116.html

I think the fix is already backported to dumpling; I'm not sure it has been done for firefly yet.

Alexandre

----- Original Message -----
From: "Olivier Bonvalet" <ceph.list@xxxxxxxxx>
To: "ceph-users" <ceph-users@xxxxxxxxxxxxxx>
Sent: Wednesday, March 4, 2015 12:10:30
Subject: Perf problem after upgrade from dumpling to firefly

Hi,

last Saturday I upgraded my production cluster from dumpling to emperor (since we had been using it successfully on a test cluster).

A couple of hours later, OSDs started failing: some of them were marked as down by Ceph, probably because of IO starvation. I set the cluster to «noout», restarted the downed OSDs, then let them recover. 24 hours later, same problem (at nearly the same hour). So I chose to upgrade directly to firefly, which is still maintained.

Things are better now, but the cluster is slower than with dumpling.

The main problem seems to be that the OSDs do twice as many write operations per second:
https://daevel.fr/img/firefly/firefly-upgrade-OSD70-IO.png
https://daevel.fr/img/firefly/firefly-upgrade-OSD71-IO.png

But the journal doesn't change (SSD dedicated to OSD 70+71+72):
https://daevel.fr/img/firefly/firefly-upgrade-OSD70+71-journal.png

Neither does node bandwidth:
https://daevel.fr/img/firefly/firefly-upgrade-dragan-bandwidth.png

Or whole-cluster IO activity:
https://daevel.fr/img/firefly/firefly-upgrade-cluster-IO.png

Some background: the cluster is split into pools with «full SSD» OSDs and «HDD + SSD journal» OSDs. Only the «HDD+SSD» OSDs seem to be affected. I have 9 OSDs per «HDD+SSD» node (9 HDDs and 3 SSDs), and only 3 «HDD+SSD» nodes (so a total of 27 «HDD+SSD» OSDs).

The IO peak between 03h00 and 09h00 corresponds to snapshot rotation (= «rbd snap rm» operations).

osd_snap_trim_sleep has been set to 0.8 for months. Yesterday I tried reducing osd_pg_max_concurrent_snap_trims to 1; it doesn't seem to really help. The only thing which does seem to help is reducing osd_disk_threads from 8 to 1.

So. Any idea what's happening?

Thanks for any help,
Olivier

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
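
For reference, a minimal sketch of how the handling and tuning described above can be applied on a firefly-era cluster. The flags and values simply mirror what is mentioned in the thread (osd_snap_trim_sleep = 0.8, osd_pg_max_concurrent_snap_trims = 1, osd_disk_threads = 1); they are not recommendations, and the OSD id used for verification is just an example:

    # prevent the cluster from re-balancing while flapping OSDs are restarted
    ceph osd set noout
    # ... restart the affected OSDs and let them recover, then ...
    ceph osd unset noout

    # adjust snap-trim throttling at runtime on all OSDs
    ceph tell osd.* injectargs '--osd_snap_trim_sleep 0.8'
    ceph tell osd.* injectargs '--osd_pg_max_concurrent_snap_trims 1'

    # verify what a given OSD is actually running with (via its admin socket)
    ceph --admin-daemon /var/run/ceph/ceph-osd.70.asok config show | grep -E 'snap_trim|disk_threads'

To make such settings survive OSD restarts, the same options would go into the [osd] section of ceph.conf on the OSD nodes, for example:

    [osd]
        osd snap trim sleep = 0.8
        osd pg max concurrent snap trims = 1
        osd disk threads = 1

Note that osd_disk_threads only takes effect on daemon restart, while the snap-trim options can be injected at runtime as shown above.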