Are you sure it was because of the configuration changes? Maybe it was restarting
the OSDs that fixed it? We often hit an issue with backfill_toofull where the
recovery/backfill processes get stuck until we restart the daemons (sometimes
setting recovery_max_active helps as well). It still shows recovery of a few
objects now and then (a few KB/s) and then stops completely. (Rough command
sketches for restarting OSDs, adjusting these settings, and a few of the checks
suggested further down the thread are appended at the end of this message.)

Jan

> On 20 Aug 2015, at 17:43, Alex Gorbachev <ag@xxxxxxxxxxxxxxxxxxx> wrote:
>
>> Just to update the mailing list, we ended up going back to the default
>> ceph.conf, without any additional settings beyond what is mandatory. We
>> are now reaching speeds we never reached before, both in recovery and in
>> regular usage. There was definitely something we set in ceph.conf that
>> was bogging everything down.
>
> Could you please share the old and new ceph.conf, or the section that
> was removed?
>
> Best regards,
> Alex
>
>> On 2015-08-20 4:06 AM, Christian Balzer wrote:
>>>
>>> Hello,
>>>
>>> From all the pertinent points by Somnath, the one about pre-conditioning
>>> would be pretty high on my list, especially if this slowness persists
>>> and nothing else (scrub) is going on.
>>>
>>> This might be "fixed" by doing an fstrim.
>>>
>>> Additionally, the levelDBs (one per OSD) are of course syncing heavily
>>> during reconstruction, so that might not be the favorite thing for your
>>> type of SSDs.
>>>
>>> But ultimately situational awareness is very important, as in finding
>>> out "what" is actually going on and slowing things down.
>>> As usual, my recommendation would be to use atop, iostat or similar on
>>> all your nodes and see if your OSD SSDs are indeed the bottleneck, or if
>>> it is maybe just one of them, or something else entirely.
>>>
>>> Christian
>>>
>>> On Wed, 19 Aug 2015 20:54:11 +0000 Somnath Roy wrote:
>>>
>>>> Also, check whether scrubbing has started in the cluster or not. That
>>>> may considerably slow down the cluster.
>>>>
>>>> -----Original Message-----
>>>> From: Somnath Roy
>>>> Sent: Wednesday, August 19, 2015 1:35 PM
>>>> To: 'J-P Methot'; ceph-users@xxxxxxxx
>>>> Subject: RE: Bad performances in recovery
>>>>
>>>> All the writes will go through the journal.
>>>> It may be that your SSDs are not preconditioned well, and after a lot
>>>> of writes during recovery the IOs stabilize at a lower number. This is
>>>> quite common for SSDs if that is the case.
>>>>
>>>> Thanks & Regards
>>>> Somnath
>>>>
>>>> -----Original Message-----
>>>> From: J-P Methot [mailto:jpmethot@xxxxxxxxxx]
>>>> Sent: Wednesday, August 19, 2015 1:03 PM
>>>> To: Somnath Roy; ceph-users@xxxxxxxx
>>>> Subject: Re: Bad performances in recovery
>>>>
>>>> Hi,
>>>>
>>>> Thank you for the quick reply. However, we do have those exact settings
>>>> for recovery and it still strongly affects client IO. I have looked at
>>>> various Ceph and OSD logs and nothing is out of the ordinary.
>>>> Here's an idea though, please tell me if I am wrong.
>>>>
>>>> We use Intel SSDs for journaling and Samsung SSDs as the actual OSDs.
>>>> As was explained several times on this mailing list, Samsung SSDs suck
>>>> in Ceph: they have horrible O_dsync speed and die easily when used as
>>>> journals. That's why we're using Intel SSDs for journaling, so that we
>>>> didn't end up putting 96 Samsung SSDs in the trash.
>>>>
>>>> In recovery though, what is the Ceph behaviour? What kind of writes
>>>> does it do on the OSD SSDs? Does it write directly to the SSDs or
>>>> through the journal?
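One rough way to see the O_dsync behaviour being described here is a
single-threaded 4k direct+sync write test, which approximates the write
pattern a filestore journal generates. /dev/sdX below is a placeholder, the
test overwrites whatever is on the device, so only point it at a spare SSD,
and the numbers are indicative only:

    # DESTRUCTIVE: writes straight to the raw device. /dev/sdX is a
    # placeholder -- use a spare SSD that holds no data.
    dd if=/dev/zero of=/dev/sdX bs=4k count=100000 oflag=direct,dsync

    # Roughly the same pattern with fio (O_SYNC rather than O_DSYNC,
    # close enough for comparing drives):
    fio --name=journal-sync --filename=/dev/sdX --direct=1 --sync=1 \
        --rw=write --bs=4k --numjobs=1 --iodepth=1 --runtime=60 --time_based

Datacenter-grade journal SSDs typically sustain this pattern at thousands of
IOPS or more, while many consumer drives collapse to a few hundred, which is
the O_dsync weakness mentioned above.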
>>>>
>>>> Additionally, something else we noticed: the Ceph cluster is MUCH
>>>> slower after recovery than before. Clearly there is a bottleneck
>>>> somewhere, and that bottleneck does not get cleared up after the
>>>> recovery is done.
>>>>
>>>> On 2015-08-19 3:32 PM, Somnath Roy wrote:
>>>>> If you are concerned about *client IO performance* during recovery,
>>>>> use these settings:
>>>>>
>>>>> osd recovery max active = 1
>>>>> osd max backfills = 1
>>>>> osd recovery threads = 1
>>>>> osd recovery op priority = 1
>>>>>
>>>>> If you are concerned about *recovery performance*, you may want to
>>>>> bump these up, but I doubt it will help much over the default
>>>>> settings.
>>>>>
>>>>> Thanks & Regards
>>>>> Somnath
>>>>>
>>>>> -----Original Message-----
>>>>> From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf
>>>>> Of J-P Methot
>>>>> Sent: Wednesday, August 19, 2015 12:17 PM
>>>>> To: ceph-users@xxxxxxxx
>>>>> Subject: Bad performances in recovery
>>>>>
>>>>> Hi,
>>>>>
>>>>> Our setup currently comprises 5 OSD nodes with 12 OSDs each, for a
>>>>> total of 60 OSDs. All of these are SSDs, with 4 SSD journal drives on
>>>>> each node. The Ceph version is Hammer v0.94.1. There is a performance
>>>>> overhead because we're using SSDs (I've heard it gets better in
>>>>> Infernalis, but we're not upgrading just yet), but we can reach
>>>>> numbers that I would consider "alright".
>>>>>
>>>>> Now, the issue is that when the cluster goes into recovery it's very
>>>>> fast at first, but then slows down to ridiculous levels as it moves
>>>>> forward. You can go from 7% to 2% left to recover in ten minutes, but
>>>>> it may take 2 hours to recover the last 2%. While this happens, the
>>>>> attached OpenStack setup becomes incredibly slow, even though there is
>>>>> only a small fraction of objects still recovering (less than 1%). The
>>>>> settings that may affect recovery speed are very low, as they are by
>>>>> default, yet they still affect client IO speed way more than they
>>>>> should.
>>>>>
>>>>> Why would Ceph recovery become so slow as it progresses, and affect
>>>>> client IO even though it's recovering at a snail's pace? And by a
>>>>> snail's pace, I mean a few KB/s on 10 Gbps uplinks.
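For reference, the settings Somnath lists above can also be applied at
runtime without restarting the daemons, and verified per OSD through the
admin socket. A minimal sketch; osd.0 is just an example id, and the
config-show command has to run on the node hosting that OSD:

    # apply to all OSDs at runtime (spaces in the option names become underscores)
    ceph tell osd.* injectargs '--osd_max_backfills 1 --osd_recovery_max_active 1'
    ceph tell osd.* injectargs '--osd_recovery_threads 1 --osd_recovery_op_priority 1'

    # confirm what a given OSD is actually running with (run on that OSD's host)
    ceph daemon osd.0 config show | grep -E 'recovery|backfill'

Values injected this way do not survive a daemon restart, so anything that
should stick still needs to go into the [osd] section of ceph.conf.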
>>
>> --
>> ======================
>> Jean-Philippe Méthot
>> Administrateur système / System administrator
>> GloboTech Communications
>> Phone: 1-514-907-0050
>> Toll Free: 1-(888)-GTCOMM1
>> Fax: 1-(514)-907-0750
>> jpmethot@xxxxxxxxxx
>> http://www.gtcomm.net

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
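A few rough command sketches for the checks suggested in the thread above;
all values, paths and ids are examples only. To see whether scrubbing is
running, per Somnath's note, and to pause it while recovery is being
investigated:

    ceph -s                        # pg states show "scrubbing" / "scrubbing+deep"
    ceph pg dump | grep scrubbing  # which pgs are scrubbing right now
    ceph osd set noscrub           # temporarily rule scrubbing out...
    ceph osd set nodeep-scrub      # ...and remember to unset both afterwards

To see whether the OSD SSDs, or just one of them, are the bottleneck, per
Christian's atop/iostat suggestion, watch %util and await on each OSD node:

    iostat -xm 2
    atop 2

And Christian's fstrim idea, run against each OSD data mount. The path below
is the default filestore layout; trimming can itself cause latency spikes on
some SSDs, so try it on one node first:

    for d in /var/lib/ceph/osd/ceph-*; do fstrim -v "$d"; done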
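For Jan's point at the top about recovery sitting stuck until the daemons are
restarted: restarting a single OSD and nudging osd_recovery_max_active can be
done as sketched below. The osd id 12 and the value 5 are arbitrary examples,
and the restart form depends on the init system in use:

    systemctl restart ceph-osd@12        # systemd
    restart ceph-osd id=12               # upstart (e.g. Ubuntu 14.04)
    /etc/init.d/ceph restart osd.12      # sysvinit

    ceph tell osd.* injectargs '--osd_recovery_max_active 5'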