Re: Discuss: New default recovery config settings

On Fri, May 29, 2015 at 5:47 PM, Samuel Just <sjust@xxxxxxxxxx> wrote:
> Many people have reported that they need to lower the osd recovery config options to minimize the impact of recovery on client io.  We are talking about changing the defaults as follows:
>
> osd_max_backfills to 1 (from 10)
> osd_recovery_max_active to 3 (from 15)
> osd_recovery_op_priority to 1 (from 10)
> osd_recovery_max_single_start to 1 (from 5)
>
> We'd like a bit of feedback first though.  Is anyone happy with the current configs?  Is anyone using something between these values and the current defaults?  What kind of workload?  I'd guess that lowering osd_max_backfills to 1 is probably a good idea, but I wonder whether lowering osd_recovery_max_active and osd_recovery_max_single_start will cause small objects to recover unacceptably slowly.
>
> Thoughts?
> -Sam

Sam, I was thinking about this recently. We recently hit a recovery storm and a
scrub storm, both at a time of high client activity. While lowering the defaults
will make these kinds of disruptions less likely, it also makes recovery
(rebalancing) very slow.
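
For reference, my reading is that the proposal amounts to something like the
following in ceph.conf (just transcribing the values from your mail into the
usual [osd] section, not a recommendation; the same options can also be changed
at runtime with "ceph tell osd.* injectargs"):

  [osd]
  osd max backfills = 1
  osd recovery max active = 3
  osd recovery op priority = 1
  osd recovery max single start = 1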

What I would be happy to see is more of a QoS-style tunable along the lines of
network traffic shaping: we could guarantee a minimum amount of recovery "load"
(in quotes since more than one resource is involved) while the cluster is busy
with client IO, or, vice versa, guarantee a minimum amount of client IO. Then,
during periods of lower client activity, recovery (and other background work)
can proceed at full speed. Many workloads are cyclical or seasonal (in the
statistical sense, e.g. intra/infra-day seasonality).
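
To make that concrete, here is a rough, purely illustrative sketch (Python, not
Ceph code; the class, the knob names, and the numbers are all made up) of a
deficit-based scheduler that reserves a minimum share of dispatch slots for each
traffic class and lets whichever class still has work absorb idle capacity:

# Purely illustrative; not Ceph code. Names and numbers are invented.
import collections

class TwoClassScheduler:
    def __init__(self, client_min=0.8, recovery_min=0.1):
        # Guaranteed minimum fraction of dispatch slots per class
        # when both classes have work queued.
        self.min_share = {"client": client_min, "recovery": recovery_min}
        self.queues = {k: collections.deque() for k in self.min_share}
        self.dispatched = {k: 0 for k in self.min_share}

    def submit(self, klass, op):
        self.queues[klass].append(op)

    def next_op(self):
        backlog = [k for k, q in self.queues.items() if q]
        if not backlog:
            return None
        if len(backlog) == 1:
            # Only one class has work: it gets the full capacity,
            # i.e. recovery runs at full speed when clients are idle.
            klass = backlog[0]
        else:
            # Both classes are busy: pick the class furthest below its
            # guaranteed share (largest deficit).
            total = sum(self.dispatched.values()) or 1
            klass = min(backlog,
                        key=lambda k: self.dispatched[k] / total
                                      - self.min_share[k])
        self.dispatched[klass] += 1
        return self.queues[klass].popleft()

# Example: each class gets at least its reserved share while both queues
# are busy; if clients go quiet, recovery takes all the capacity.
sched = TwoClassScheduler()
sched.submit("client", "read obj.1")
sched.submit("recovery", "backfill pg 1.2")
op = sched.next_op()  # client op is dispatched first (largest deficit)

A real implementation would obviously need a sliding window for the accounting
and per-resource cost estimates rather than raw op counts, but the shape of the
knobs (minimum shares instead of absolute queue limits) is the point.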

QoS-style management should lead to a more dynamic system where we can maximize
utilization, minimize disruptions, and not play whack-a-mole with many conf
knobs. I'm aware that this is much harder to implement, but thankfully there is
a lot of literature, implementation work, and practical experience out there to
draw upon.

- Milosz

-- 
Milosz Tanski
CTO
16 East 34th Street, 15th floor
New York, NY 10016

p: 646-253-9055
e: milosz@xxxxxxxxx