Re: NOHZ: WARNING: at arch/x86/kernel/smp.c:123 native_smp_send_reschedule, round 2

On 20 May 2013 18:53, Borislav Petkov <bp@xxxxxxxxx> wrote:
> I just confirmed that policy->cpus contains offlined cores with this:
>
> diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
> index 5af40ad82d23..e8c25f71e9b6 100644
> --- a/drivers/cpufreq/cpufreq_governor.c
> +++ b/drivers/cpufreq/cpufreq_governor.c
> @@ -169,6 +169,9 @@ static inline void __gov_queue_work(int cpu, struct dbs_data *dbs_data,
>  {
>         struct cpu_dbs_common_info *cdbs = dbs_data->cdata->get_cpu_cdbs(cpu);
>
> +       if (WARN_ON(!cpu_online(cpu)))
> +               return;
> +
>         mod_delayed_work_on(cpu, system_wq, &cdbs->work, delay);
>  }

Hmm, so there is definitely a locking issue here.
Have you tried my patch? I am not sure it fixes everything, but it
may help.

> see splats collection below.
>
> And I don't think your fix above addresses the issue for the simple
> reason that if cpus go offline *before* you do get_online_cpus(), then
> policy->cpus will already contain offlined cpus.
>
> Rather, a better fix would be, IMHO, to do this (it works here, of course):
>
> ---
> diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
> index 5af40ad82d23..58541b164494 100644
> --- a/drivers/cpufreq/cpufreq_governor.c
> +++ b/drivers/cpufreq/cpufreq_governor.c
> @@ -17,6 +17,7 @@
>  #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
>
>  #include <asm/cputime.h>
> +#include <linux/cpu.h>
>  #include <linux/cpufreq.h>
>  #include <linux/cpumask.h>
>  #include <linux/export.h>
> @@ -169,7 +170,15 @@ static inline void __gov_queue_work(int cpu, struct dbs_data *dbs_data,
>  {
>         struct cpu_dbs_common_info *cdbs = dbs_data->cdata->get_cpu_cdbs(cpu);
>
> +       get_online_cpus();
> +
> +       if (!cpu_online(cpu))
> +               goto out;
> +
>         mod_delayed_work_on(cpu, system_wq, &cdbs->work, delay);
> +
> + out:
> +       put_online_cpus();
>  }
>
>  void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy,

This looks fine, but I want to fix the locking rather than just
hiding the issue. :)