On 07/29/2013 05:24 PM, Rafael J. Wysocki wrote: > On Monday, July 29, 2013 04:52:39 PM Viresh Kumar wrote: >> On Mon, Jul 29, 2013 at 3:11 PM, Srivatsa S. Bhat >> <srivatsa.bhat@xxxxxxxxxxxxxxxxxx> wrote: >>> I'm assuming that the module_get() is used in the cpufreq core to ensure that >>> until the cpufreq core is managing (atleast one) CPU(s), the cpufreq backend >>> driver module (eg: acpi-cpufreq) can't be removed. >> >> I missed this simple stuff in my email.. :( >> >>> But the cpufreq_add_dev() function does a module *put* at the end of >>> initializing a fresh CPU. >>> >>> 1057 kobject_uevent(&policy->kobj, KOBJ_ADD); >>> 1058 module_put(cpufreq_driver->owner); >>> 1059 pr_debug("initialization complete\n"); >>> 1060 >>> 1061 return 0; >> >> That actually looks wrong. And shoud be fixed. > > OK > >>> So, I wonder if it would be a good idea to instead allow that CPU to take a >>> module refcount as well. That way, the problem that Toralf reported would go >>> away, and at the same time, we can ensure that as long as the cpufreq core is >>> managing CPUs, the cpufreq-backend-driver module refcount never drops to zero. >>> >>> Something like this: >>> >>> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c >>> index a4ad733..ecfbc52 100644 >>> --- a/drivers/cpufreq/cpufreq.c >>> +++ b/drivers/cpufreq/cpufreq.c >>> @@ -878,9 +878,14 @@ static int cpufreq_add_dev_interface(unsigned int cpu, >>> } >>> write_unlock_irqrestore(&cpufreq_driver_lock, flags); >>> >>> + /* Bump up the refcount for this CPU */ >>> + policy = cpufreq_cpu_get(cpu); >>> + >>> ret = cpufreq_add_dev_symlink(cpu, policy); >>> - if (ret) >>> + if (ret) { >>> + cpufreq_cpu_put(policy); >>> goto err_out_kobj_put; >>> + } >> >> That will do an extra kobject_get() which we don't require.. Only removing what >> I mentioned earlier should be good enough, I believe. >> >> Over that, I think below code in __cpufreq_governor() is also wrong. >> >> /* we keep one module reference alive for >> each CPU governed by this CPU */ >> if ((event != CPUFREQ_GOV_START) || ret) >> module_put(policy->governor->owner); >> if ((event == CPUFREQ_GOV_STOP) && !ret) >> module_put(policy->governor->owner); >> >> The second if is sensible as it puts count when governor is stopped. >> But the first one decrements the count when we failed for anything >> other than START.. >> >> But this routine is called for other stuff too: >> - INIT/Exit >> - LIMITS, >> >> And so, count must be incremented for a passed INIT call and >> decremented for a passed EXIT call, otherwise shouldn't be >> touched. > > That sounds good, but I don't want to make those changes for 3.11 and at the > same time I *do* want the reference count issue to go away. > > So the patch below is the one I'd like to apply for the time being and > we can work on more improvements on top of that. > > Any objections? > > Toralf, please test this patch in the meantime. > > Rafael > > > --- > From: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx> > Subject: cpufreq: Fix cpufreq driver module refcount balance after suspend/resume > > Since cpufreq_cpu_put() called by __cpufreq_remove_dev() drops the > driver module refcount, __cpufreq_remove_dev() causes that refcount > to become negative for the cpufreq driver after a suspend/resume > cycle. > > This is not the only bad thing that happens there, however, because > kobject_put() should only be called for the policy kobject at this > point if the CPU is not the last one for that policy. > > Namely, if the given CPU is the last one for that policy, the > policy kobject's refcount should be 1 at this point, as set by > cpufreq_add_dev_interface(), and only needs to be dropped once for > the kobject to go away. This actually happens under the cpu == 1 > check, so it need not be done before by cpufreq_cpu_put(). > > On the other hand, if the given CPU is not the last one for that > policy, this means that cpufreq_add_policy_cpu() has been called > at least once for that policy and cpufreq_cpu_get() has been > called for it too. To balance that cpufreq_cpu_get(), we need to > call cpufreq_cpu_put() in that case. > > Thus, to fix the described problem and keep the reference > counters balanced in both cases, move the cpufreq_cpu_get() call > in __cpufreq_remove_dev() to the code path executed only for > CPUs that share the policy with other CPUs. > > Reported-by: Toralf Förster <toralf.foerster@xxxxxx> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx> This version looks good as well. Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@xxxxxxxxxxxxxxxxxx> Regards, Srivatsa S. Bhat > --- > drivers/cpufreq/cpufreq.c | 19 ++++++++++--------- > 1 file changed, 10 insertions(+), 9 deletions(-) > > Index: linux-pm/drivers/cpufreq/cpufreq.c > =================================================================== > --- linux-pm.orig/drivers/cpufreq/cpufreq.c > +++ linux-pm/drivers/cpufreq/cpufreq.c > @@ -1177,14 +1177,11 @@ static int __cpufreq_remove_dev(struct d > __func__, cpu_dev->id, cpu); > } > > - if ((cpus == 1) && (cpufreq_driver->target)) > - __cpufreq_governor(data, CPUFREQ_GOV_POLICY_EXIT); > - > - pr_debug("%s: removing link, cpu: %d\n", __func__, cpu); > - cpufreq_cpu_put(data); > - > /* If cpu is last user of policy, free policy */ > if (cpus == 1) { > + if (cpufreq_driver->target) > + __cpufreq_governor(data, CPUFREQ_GOV_POLICY_EXIT); > + > lock_policy_rwsem_read(cpu); > kobj = &data->kobj; > cmp = &data->kobj_unregister; > @@ -1205,9 +1202,13 @@ static int __cpufreq_remove_dev(struct d > free_cpumask_var(data->related_cpus); > free_cpumask_var(data->cpus); > kfree(data); > - } else if (cpufreq_driver->target) { > - __cpufreq_governor(data, CPUFREQ_GOV_START); > - __cpufreq_governor(data, CPUFREQ_GOV_LIMITS); > + } else { > + pr_debug("%s: removing link, cpu: %d\n", __func__, cpu); > + cpufreq_cpu_put(data); > + if (cpufreq_driver->target) { > + __cpufreq_governor(data, CPUFREQ_GOV_START); > + __cpufreq_governor(data, CPUFREQ_GOV_LIMITS); > + } > } > > per_cpu(cpufreq_policy_cpu, cpu) = -1; > -- To unsubscribe from this list: send the line "unsubscribe cpufreq" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html