Re: [tip:sched/core] sched: Re-tune the scheduler latency defaults to decrease worst-case latencies

Martin Steigerwald <Martin@xxxxxxxxxxxx> · Sat, 12 Sep 2009 13:45:28 +0200

On Wednesday 09 September 2009, tip-bot for Mike Galbraith wrote:
> Commit-ID:  172e082a9111ea504ee34cbba26284a5ebdc53a7
> Gitweb:     http://git.kernel.org/tip/172e082a9111ea504ee34cbba26284a5ebdc53a7
> Author:     Mike Galbraith <efault@xxxxxx>
> AuthorDate: Wed, 9 Sep 2009 15:41:37 +0200
> Committer:  Ingo Molnar <mingo@xxxxxxx>
> CommitDate: Wed, 9 Sep 2009 17:30:06 +0200
> 
> sched: Re-tune the scheduler latency defaults to decrease worst-case latencies
> 
> Reduce the latency target from 20 msecs to 5 msecs.
> 
> Why? Larger latencies increase spread, which is good for scaling,
> but bad for worst case latency.
> 
> We still have the ilog(nr_cpus) rule to scale up on bigger
> server boxes.
> 
> Signed-off-by: Mike Galbraith <efault@xxxxxx>
> Acked-by: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>
> LKML-Reference: <1252486344.28645.18.camel@xxxxxxxxxxxxxxxx>
> Signed-off-by: Ingo Molnar <mingo@xxxxxxx>
> 
> 
> ---
>  kernel/sched_fair.c |   12 ++++++------
>  1 files changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
> index af325a3..26fadb4 100644
> --- a/kernel/sched_fair.c
> +++ b/kernel/sched_fair.c
> @@ -24,7 +24,7 @@
> 
>  /*
>   * Targeted preemption latency for CPU-bound tasks:
> - * (default: 20ms * (1 + ilog(ncpus)), units: nanoseconds)
> + * (default: 5ms * (1 + ilog(ncpus)), units: nanoseconds)
>   *
>   * NOTE: this latency value is not the same as the concept of
>   * 'timeslice length' - timeslices in CFS are of variable length
> @@ -34,13 +34,13 @@
>   * (to see the precise effective timeslice length of your workload,
>   *  run vmstat and monitor the context-switches (cs) field)
>   */
> -unsigned int sysctl_sched_latency = 20000000ULL;
> +unsigned int sysctl_sched_latency = 5000000ULL;
> 
>  /*
>   * Minimal preemption granularity for CPU-bound tasks:
> - * (default: 4 msec * (1 + ilog(ncpus)), units: nanoseconds)
> + * (default: 1 msec * (1 + ilog(ncpus)), units: nanoseconds)
>   */
> -unsigned int sysctl_sched_min_granularity = 4000000ULL;
> +unsigned int sysctl_sched_min_granularity = 1000000ULL;

Needs to be lower for a fluid desktop experience here:

shambhala:/proc/sys/kernel> cat sched_min_granularity_ns
100000
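
For comparison, a rough sketch of how the patched defaults scale with the number of CPUs. It assumes that "ilog" in the patch comments means floor(log2(ncpus)); the helper name scaled_default() is made up for illustration and is not kernel code.

```python
# Sketch only: mirrors the "(default: X * (1 + ilog(ncpus)))" comments
# in the quoted patch.
# Assumption: ilog(n) is floor(log2(n)).
# scaled_default() is a hypothetical helper name, not a kernel function.

# Base values taken from the "+" lines of the patch, in nanoseconds.
NEW_DEFAULTS_NS = {
    "sched_latency_ns": 5_000_000,
    "sched_min_granularity_ns": 1_000_000,
    "sched_wakeup_granularity_ns": 1_000_000,
}


def scaled_default(base_ns: int, ncpus: int) -> int:
    """Return base_ns * (1 + floor(log2(ncpus)))."""
    if ncpus < 1:
        raise ValueError("ncpus must be at least 1")
    # For a positive int, bit_length() - 1 equals floor(log2(n)).
    factor = 1 + (ncpus.bit_length() - 1)
    return base_ns * factor


if __name__ == "__main__":
    for ncpus in (1, 2, 4, 8):
        for name, base in NEW_DEFAULTS_NS.items():
            print(f"{ncpus} cpus: {name} = {scaled_default(base, ncpus)}")
```

On a single-CPU machine the factor is 1, so the patched defaults stay at 5 ms / 1 ms / 1 ms. The 100000 ns shown above is a tenth of the patched granularity default.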

> 
>  /*
>   * is kept at sysctl_sched_latency / sysctl_sched_min_granularity
> @@ -63,13 +63,13 @@ unsigned int __read_mostly sysctl_sched_compat_yield;
> 
>  /*
>   * SCHED_OTHER wake-up granularity.
> - * (default: 5 msec * (1 + ilog(ncpus)), units: nanoseconds)
> + * (default: 1 msec * (1 + ilog(ncpus)), units: nanoseconds)
>   *
>   * This option delays the preemption effects of decoupled workloads
>   * and reduces their over-scheduling. Synchronous workloads will still
>   * have immediate wakeup/sleep latencies.
>   */
> -unsigned int sysctl_sched_wakeup_granularity = 5000000UL;
> +unsigned int sysctl_sched_wakeup_granularity = 1000000UL;

Ditto:

shambhala:/proc/sys/kernel> cat sched_wakeup_granularity_ns
100000

With

shambhala:~> cat /proc/version
Linux version 2.6.31-rc7-tp42-toi-3.0.1-04741-g57e61c0 (martin@shambhala) (gcc version 4.3.3 (Debian 4.3.3-10) ) #6 PREEMPT Sun Aug 23 10:51:32 CEST 2009

on my ThinkPad T42.

Otherwise compositing animations, such as switching desktops and zooming in 
newly opened windows, still appear jerky, even with:

shambhala:/sys/kernel/debug> cat sched_features
NO_NEW_FAIR_SLEEPERS NO_NORMALIZED_SLEEPER ADAPTIVE_GRAN WAKEUP_PREEMPT 
START_DEBIT AFFINE_WAKEUPS CACHE_HOT_BUDDY SYNC_WAKEUPS NO_HRTICK 
NO_DOUBLE_TICK ASYM_GRAN LB_BIAS LB_WAKEUP_UPDATE ASYM_EFF_LOAD 
NO_WAKEUP_OVERLAP LAST_BUDDY OWNER_SPIN

But NO_NEW_FAIR_SLEEPERS also gives a benefit. It makes those animations 
even smoother.

Overall I am quite happy with

shambhala:/proc/sys/kernel> grep "" *sched*
sched_child_runs_first:0
sched_compat_yield:0
sched_features:113916
sched_latency_ns:5000000
sched_migration_cost:500000
sched_min_granularity_ns:100000
sched_nr_migrate:32
sched_rt_period_us:1000000
sched_rt_runtime_us:950000
sched_shares_ratelimit:250000
sched_shares_thresh:4
sched_wakeup_granularity_ns:100000

for now.
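
The same nanosecond tunables can be collected with a few lines of script, which makes it easier to compare machines. This is only a sketch: read_sched_ns() is a made-up helper, and the directory is a parameter because the location of these files is an assumption that holds on the kernel shown above but may not hold elsewhere.

```python
# Sketch only: read the sched_*_ns tunables from a sysctl directory.
# read_sched_ns() is a hypothetical helper name. The default path matches
# the shell prompt above (/proc/sys/kernel); other kernels may differ.
from pathlib import Path


def read_sched_ns(directory: str = "/proc/sys/kernel") -> dict:
    """Return {file name: value in ns} for every sched_*_ns file found."""
    values = {}
    for path in sorted(Path(directory).glob("sched_*_ns")):
        try:
            values[path.name] = int(path.read_text().strip())
        except (OSError, ValueError):
            # Unreadable or non-numeric entries are skipped.
            continue
    return values


if __name__ == "__main__":
    for name, ns in read_sched_ns().items():
        print(f"{name}: {ns} ns = {ns / 1_000_000} ms")
```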

It really makes a *lot* of difference. But it appears that both 
sched_min_granularity_ns and sched_wakeup_granularity_ns have to be lower 
on my ThinkPad for best effect.
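
To put numbers on how these two settings interact: the patch comment says a value is kept at sysctl_sched_latency / sysctl_sched_min_granularity. As far as I understand the CFS code, the scheduling period stays at the latency target until more tasks than that ratio are runnable, and is then stretched to min_granularity * nr_running. The sketch below assumes exactly that rule and rounds the ratio up; nr_latency() and period_ns() are illustrative names, not the kernel's functions.

```python
# Sketch only, under the assumption described above:
#   period = latency                      if nr_running <= latency / min_gran
#   period = min_gran * nr_running        otherwise
# nr_latency() and period_ns() are hypothetical helper names.


def nr_latency(latency_ns: int, min_gran_ns: int) -> int:
    """Ratio latency / min_granularity, rounded up."""
    return -(-latency_ns // min_gran_ns)


def period_ns(latency_ns: int, min_gran_ns: int, nr_running: int) -> int:
    """Length of one scheduling period for nr_running runnable tasks."""
    if nr_running > nr_latency(latency_ns, min_gran_ns):
        return min_gran_ns * nr_running
    return latency_ns


if __name__ == "__main__":
    # Patched defaults versus the values from my sysctl listing.
    for label, min_gran in (("patched default", 1_000_000), ("mine", 100_000)):
        for n in (2, 10, 60):
            p = period_ns(5_000_000, min_gran, n)
            print(f"{label}: {n} tasks -> period {p} ns, equal share {p // n} ns")
```

Under that assumption, a min granularity of 100000 ns keeps the period at the 5 ms target for up to 50 runnable tasks, while the patched default of 1 ms only does so for up to 5.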

I would still prefer some autotuning, where I say "desktop!" or nothing at 
all. And that's it.

Ciao,
-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7