> > Just to update the mailing list, we ended up going back to default > ceph.conf without any additional settings than what is mandatory. We are > now reaching speeds we never reached before, both in recovery and in > regular usage. There was definitely something we set in the ceph.conf > bogging everything down. Could you please share the old and new ceph.conf, or the section that was removed? Best regards, Alex > > > On 2015-08-20 4:06 AM, Christian Balzer wrote: >> >> Hello, >> >> from all the pertinent points by Somnath, the one about pre-conditioning >> would be pretty high on my list, especially if this slowness persists and >> nothing else (scrub) is going on. >> >> This might be "fixed" by doing a fstrim. >> >> Additionally the levelDB's per OSD are of course sync'ing heavily during >> reconstruction, so that might not be the favorite thing for your type of >> SSDs. >> >> But ultimately situational awareness is very important, as in "what" is >> actually going and slowing things down. >> As usual my recommendations would be to use atop, iostat or similar on all >> your nodes and see if your OSD SSDs are indeed the bottleneck or if it is >> maybe just one of them or something else entirely. >> >> Christian >> >> On Wed, 19 Aug 2015 20:54:11 +0000 Somnath Roy wrote: >> >>> Also, check if scrubbing started in the cluster or not. That may >>> considerably slow down the cluster. >>> >>> -----Original Message----- >>> From: Somnath Roy >>> Sent: Wednesday, August 19, 2015 1:35 PM >>> To: 'J-P Methot'; ceph-users@xxxxxxxx >>> Subject: RE: Bad performances in recovery >>> >>> All the writes will go through the journal. >>> It may happen your SSDs are not preconditioned well and after a lot of >>> writes during recovery IOs are stabilized to lower number. This is quite >>> common for SSDs if that is the case. >>> >>> Thanks & Regards >>> Somnath >>> >>> -----Original Message----- >>> From: J-P Methot [mailto:jpmethot@xxxxxxxxxx] >>> Sent: Wednesday, August 19, 2015 1:03 PM >>> To: Somnath Roy; ceph-users@xxxxxxxx >>> Subject: Re: Bad performances in recovery >>> >>> Hi, >>> >>> Thank you for the quick reply. However, we do have those exact settings >>> for recovery and it still strongly affects client io. I have looked at >>> various ceph logs and osd logs and nothing is out of the ordinary. >>> Here's an idea though, please tell me if I am wrong. >>> >>> We use intel SSDs for journaling and samsung SSDs as proper OSDs. As was >>> explained several times on this mailing list, Samsung SSDs suck in ceph. >>> They have horrible O_dsync speed and die easily, when used as journal. >>> That's why we're using Intel ssds for journaling, so that we didn't end >>> up putting 96 samsung SSDs in the trash. >>> >>> In recovery though, what is the ceph behaviour? What kind of write does >>> it do on the OSD SSDs? Does it write directly to the SSDs or through the >>> journal? >>> >>> Additionally, something else we notice: the ceph cluster is MUCH slower >>> after recovery than before. Clearly there is a bottleneck somewhere and >>> that bottleneck does not get cleared up after the recovery is done. >>> >>> >>> On 2015-08-19 3:32 PM, Somnath Roy wrote: >>>> If you are concerned about *client io performance* during recovery, >>>> use these settings.. >>>> >>>> osd recovery max active = 1 >>>> osd max backfills = 1 >>>> osd recovery threads = 1 >>>> osd recovery op priority = 1 >>>> >>>> If you are concerned about *recovery performance*, you may want to >>>> bump this up, but I doubt it will help much from default settings.. >>>> >>>> Thanks & Regards >>>> Somnath >>>> >>>> -----Original Message----- >>>> From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf >>>> Of J-P Methot >>>> Sent: Wednesday, August 19, 2015 12:17 PM >>>> To: ceph-users@xxxxxxxx >>>> Subject: Bad performances in recovery >>>> >>>> Hi, >>>> >>>> Our setup is currently comprised of 5 OSD nodes with 12 OSD each, for >>>> a total of 60 OSDs. All of these are SSDs with 4 SSD journals on each. >>>> The ceph version is hammer v0.94.1 . There is a performance overhead >>>> because we're using SSDs (I've heard it gets better in infernalis, but >>>> we're not upgrading just yet) but we can reach numbers that I would >>>> consider "alright". >>>> >>>> Now, the issue is, when the cluster goes into recovery it's very fast >>>> at first, but then slows down to ridiculous levels as it moves >>>> forward. You can go from 7% to 2% to recover in ten minutes, but it >>>> may take 2 hours to recover the last 2%. While this happens, the >>>> attached openstack setup becomes incredibly slow, even though there is >>>> only a small fraction of objects still recovering (less than 1%). The >>>> settings that may affect recovery speed are very low, as they are by >>>> default, yet they still affect client io speed way more than it should. >>>> >>>> Why would ceph recovery become so slow as it progress and affect >>>> client io even though it's recovering at a snail's pace? And by a >>>> snail's pace, I mean a few kb/second on 10gbps uplinks. -- >>>> ====================== Jean-Philippe Méthot >>>> Administrateur système / System administrator GloboTech Communications >>>> Phone: 1-514-907-0050 >>>> Toll Free: 1-(888)-GTCOMM1 >>>> Fax: 1-(514)-907-0750 >>>> jpmethot@xxxxxxxxxx >>>> http://www.gtcomm.net >>>> _______________________________________________ >>>> ceph-users mailing list >>>> ceph-users@xxxxxxxxxxxxxx >>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com >>>> >>>> ________________________________ >>>> >>>> PLEASE NOTE: The information contained in this electronic mail message >>>> is intended only for the use of the designated recipient(s) named >>>> above. If the reader of this message is not the intended recipient, >>>> you are hereby notified that you have received this message in error >>>> and that any review, dissemination, distribution, or copying of this >>>> message is strictly prohibited. If you have received this >>>> communication in error, please notify the sender by telephone or >>>> e-mail (as shown above) immediately and destroy any and all copies of >>>> this message in your possession (whether hard copies or electronically >>>> stored copies). >>>> >>> >>> >>> -- >>> ====================== >>> Jean-Philippe Méthot >>> Administrateur système / System administrator GloboTech Communications >>> Phone: 1-514-907-0050 >>> Toll Free: 1-(888)-GTCOMM1 >>> Fax: 1-(514)-907-0750 >>> jpmethot@xxxxxxxxxx >>> http://www.gtcomm.net >>> _______________________________________________ >>> ceph-users mailing list >>> ceph-users@xxxxxxxxxxxxxx >>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com >> >> > > > -- > ====================== > Jean-Philippe Méthot > Administrateur système / System administrator > GloboTech Communications > Phone: 1-514-907-0050 > Toll Free: 1-(888)-GTCOMM1 > Fax: 1-(514)-907-0750 > jpmethot@xxxxxxxxxx > http://www.gtcomm.net > _______________________________________________ > ceph-users mailing list > ceph-users@xxxxxxxxxxxxxx > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com _______________________________________________ ceph-users mailing list ceph-users@xxxxxxxxxxxxxx http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com