Hi,

checking the actual value for osd_max_backfills on our cluster (0.94.9), I also made a config diff of the OSD configuration (ceph daemon osd.0 config diff) and wondered why it shows a default of 10, which differs from the documented default at http://docs.ceph.com/docs/master/rados/configuration/osd-config-ref/. Did the default value change since hammer?

Regards,
Steffen

>>> David Turner <drakonstein@xxxxxxxxx> wrote on Tuesday, 9 May 2017 at 00:03:
> WOW!!! Those are some awfully high backfilling settings you have there.
> They are 100% the reason that your customers think your system is down.
> You're telling each OSD to be able to have 20 backfill operations running
> at the exact same time. I bet if you watch iostat -x 1 on one of your
> nodes before and after you inject those settings, the disk usage will go
> from a decent 40-70% and jump all the way up to 100% as soon as those
> settings are injected.
>
> When you are backfilling, you are copying data from one drive to another.
> Each backfill allowed by osd-max-backfills is another file the OSD tries
> to copy at the same time. These can be receiving data (writing to the
> disk) or moving data off (reading from the disk followed by a delete). So
> by having 20 backfills happening at a time, you are telling each disk to
> allow 20 files to be written and/or read from it at the same time. What
> happens to a disk when you are copying 20 large files to it at a time?
> All of them move slower (largely due to disk thrashing, with 20 threads
> all reading from and writing to different parts of the disk).
>
> What you want to find is the point where your disks are usually around
> 80-90% utilized while backfilling, but not consistently 100%. The easy
> way to do that is to increase osd-max-backfills by 1 or 2 at a time until
> you see it go too high, and then back off. I don't know many people who
> go above 5 max backfills in a production cluster on spinning disks.
> Usually the ones who do, do it temporarily, while they know their cluster
> isn't being used much by customers.
>
> Personally, I have never used osd-recovery-threads or
> osd-recovery-max-active; I've been able to tune my clusters using only
> osd-max-backfills. The lower you leave these, the longer the backfill
> will take, but the less impact your customers will notice. I've found 3
> to be a generally safe number if customer IO is your priority; 5 works
> well if your customers are OK with it being slow (but still usable)...
> but all of this depends on your hardware and software use cases. Test it
> while watching your disk utilization, and test your application while
> finding the right number for your environment.
>
> Good luck :)
>
> On Mon, May 8, 2017 at 5:43 PM Daniel Davidson <danield@xxxxxxxxxxxxxxxx>
> wrote:
>
>> Our Ceph system performs very poorly, or not at all, while the remapping
>> procedure is underway. We are using replica 2 and the following ceph
>> tweaks while it is in process:
>>
>> ceph tell osd.* injectargs '--osd-recovery-max-active 20'
>> ceph tell osd.* injectargs '--osd-recovery-threads 20'
>> ceph tell osd.* injectargs '--osd-max-backfills 20'
>> ceph -w
>> ceph osd set noscrub
>> ceph osd set nodeep-scrub
>>
>> After the remapping finishes, we set these back to default.
>>
>> Are any of these causing our problems, or is there another way to limit
>> the impact of the remapping so that users do not think the system is
>> down while we add more storage?
>>
>> thanks,
>>
>> Dan
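For reference, a minimal sketch of the commands discussed in this thread, assuming a Hammer (0.94.x) cluster with the default admin socket setup; verify the option names and defaults against your version's documentation:

  # Show the value osd.0 is actually running with (run on the host carrying osd.0)
  ceph daemon osd.0 config show | grep osd_max_backfills

  # Show only the settings that differ from the built-in defaults
  ceph daemon osd.0 config diff

  # Raise the backfill limit cautiously, 1 or 2 at a time, e.g.
  ceph tell osd.* injectargs '--osd-max-backfills 3'

  # ...while watching per-disk utilization (%util) on an OSD node
  iostat -x 1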