All,
Whenever we're doing some kind of recovery operation on our ceph
clusters (cluster expansion or dealing with a drive failure), there
seems to be a fairly noticeable performance drop while the backfills
run (the last time I measured it, performance during recovery was
something like 20% of a healthy cluster). I'm wondering if there are
any settings we might be missing that would improve this situation?
Before doing any kind of expansion operation I make sure both 'noscrub'
and 'nodeep-scrub' are set so that scrubbing isn't making things
worse.
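For reference, this is roughly what I run before and after the expansion
(just the standard 'ceph osd set'/'unset' flags):

    # pause scrubbing before the expansion
    ceph osd set noscrub
    ceph osd set nodeep-scrub

    # ... do the expansion / wait for recovery to finish ...

    # re-enable scrubbing once the cluster is healthy again
    ceph osd unset noscrub
    ceph osd unset nodeep-scrub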
We also have the following options set in our ceph.conf:
[osd]
osd_journal_size = 16384
osd_max_backfills = 1
osd_recovery_max_active = 1
osd_recovery_op_priority = 1
osd_recovery_max_single_start = 1
osd_op_threads = 12
osd_crush_initial_weight = 0
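(In case it's useful to anyone: I believe the recovery throttles above can
also be injected at runtime so they take effect without restarting the OSDs,
assuming your release supports injectargs. Something along these lines:)

    # apply the recovery throttles to all OSDs without a restart
    ceph tell osd.* injectargs '--osd_max_backfills 1 --osd_recovery_max_active 1 --osd_recovery_op_priority 1 --osd_recovery_max_single_start 1'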
I'm wondering if there might be a way to use ionice with the CFQ scheduler
to put the recovery traffic into the Idle class so that customer
traffic has a higher priority?
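For what it's worth, the only knobs I'm aware of in that direction are the two
below, but I'm not sure they actually help here: as far as I can tell they only
set the I/O priority of the OSD's disk thread (scrubbing/snap trimming rather
than backfill), and they only do anything when the OSD data disk is really
using the CFQ scheduler.

    # check the scheduler on the OSD data disk first (sdb is just an example)
    cat /sys/block/sdb/queue/scheduler

    # ceph.conf, [osd] section
    osd_disk_thread_ioprio_class = idle
    osd_disk_thread_ioprio_priority = 7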
Thanks,
Bryan