Re: Discuss: New default recovery config settings

I like the idea of turning the defaults down.  During the ceph operators session at the OpenStack conference last week Warren described the behavior pretty accurately as "Ceph basically DOSes itself unless you reduce those settings."  Maybe this is more of a problem when the clusters are small?

Another idea: could recovery traffic be pushed to an even lower priority by setting its ionice class to 'Idle' under the CFQ scheduler?

Bryan

From: Josef Johansson <josef86@xxxxxxxxx>
Date: Friday, May 29, 2015 at 4:16 PM
To: Samuel Just <sjust@xxxxxxxxxx>, ceph-devel <ceph-devel@xxxxxxxxxxxxxxx>, "'ceph-users@xxxxxxxxxxxxxx' (ceph-users@xxxxxxxxxxxxxx)" <ceph-users@xxxxxxxxxxxxxx>
Subject: Re: Discuss: New default recovery config settings


Hi,

We did it the other way around instead: we defined a period when the load is lighter and turn backfill/recovery off and on around it. In that case you want the backfill values to stay at what is the default right now.
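A minimal sketch of that time-window approach, for illustration only. It assumes the cluster-wide `nobackfill` and `norecover` OSD flags are what gets toggled, and the 22:00 to 06:00 window and the function names are hypothetical, not taken from this thread. The functions only build the command lists; actually running them (for example from cron) is left out.

```python
from datetime import time

# Cluster-wide flags that pause backfill and recovery (assumed to be the
# mechanism used; the thread does not say how the toggling was done).
FLAGS = ("nobackfill", "norecover")


def in_window(now, start, end):
    """Return True if `now` falls inside [start, end), allowing wrap past midnight."""
    if start <= end:
        return start <= now < end
    return now >= start or now < end


def flag_commands(now, start=time(22, 0), end=time(6, 0)):
    """Build the `ceph osd set/unset` commands for the given time of day.

    Inside the low-load window the flags are unset so recovery may run;
    outside it they are set so recovery is paused.
    The default window is a hypothetical example.
    """
    action = "unset" if in_window(now, start, end) else "set"
    return [["ceph", "osd", action, flag] for flag in FLAGS]
```

Called with `time(23, 0)` this yields the two `unset` commands, and with `time(12, 0)` the two `set` commands.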

Also, someone (I think it was Greg?) said that if you have problems with backfill, your cluster's backing store is not fast enough or is under too much load.
If 10 OSDs go down at the same time, you want those values to be high to minimize the downtime.

/Josef


Fri, 29 May 2015 at 23:47, Samuel Just <sjust@xxxxxxxxxx> wrote:
Many people have reported that they need to lower the osd recovery config options to minimize the impact of recovery on client io.  We are talking about changing the defaults as follows:

osd_max_backfills to 1 (from 10)
osd_recovery_max_active to 3 (from 15)
osd_recovery_op_priority to 1 (from 10)
osd_recovery_max_single_start to 1 (from 5)
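
For anyone who wants to try the proposed values before any default changes, here is a rough sketch that renders them as a ceph.conf `[osd]` fragment and as an argument list for `ceph tell osd.* injectargs`. The option names and values are the ones listed above; the helper names are hypothetical, and the exact injectargs syntax should be checked against your Ceph version.

```python
# Proposed and current defaults, copied from the list above: name -> (new, old).
PROPOSED = {
    "osd_max_backfills": (1, 10),
    "osd_recovery_max_active": (3, 15),
    "osd_recovery_op_priority": (1, 10),
    "osd_recovery_max_single_start": (1, 5),
}


def conf_fragment(settings):
    """Render the new values as an [osd] section for ceph.conf."""
    lines = ["[osd]"]
    for name, (new, _old) in settings.items():
        lines.append(f"{name} = {new}")
    return "\n".join(lines)


def injectargs_command(settings):
    """Build an argv list that would apply the new values to running OSDs.

    Assumes the `ceph tell osd.* injectargs '--opt value ...'` form;
    the list is only constructed here, not executed.
    """
    args = " ".join(
        f"--{name.replace('_', '-')} {new}" for name, (new, _old) in settings.items()
    )
    return ["ceph", "tell", "osd.*", "injectargs", args]
```

`print(conf_fragment(PROPOSED))` gives a block that can be pasted into ceph.conf, while the list from `injectargs_command(PROPOSED)` could be passed to `subprocess.run` on an admin node.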

We'd like a bit of feedback first though.  Is anyone happy with the current configs?  Is anyone using something between these values and the current defaults?  What kind of workload?  I'd guess that lowering osd_max_backfills to 1 is probably a good idea, but I wonder whether lowering osd_recovery_max_active and osd_recovery_max_single_start will cause small objects to recover unacceptably slowly.

Thoughts?
-Sam
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

