On Monday, July 29, 2013 07:23:35 PM Toralf Förster wrote: > On 07/29/2013 01:54 PM, Rafael J. Wysocki wrote: > > On Monday, July 29, 2013 04:52:39 PM Viresh Kumar wrote: > >> On Mon, Jul 29, 2013 at 3:11 PM, Srivatsa S. Bhat > >> <srivatsa.bhat@xxxxxxxxxxxxxxxxxx> wrote: > >>> I'm assuming that the module_get() is used in the cpufreq core to ensure that > >>> until the cpufreq core is managing (atleast one) CPU(s), the cpufreq backend > >>> driver module (eg: acpi-cpufreq) can't be removed. > >> > >> I missed this simple stuff in my email.. :( > >> > >>> But the cpufreq_add_dev() function does a module *put* at the end of > >>> initializing a fresh CPU. > >>> > >>> 1057 kobject_uevent(&policy->kobj, KOBJ_ADD); > >>> 1058 module_put(cpufreq_driver->owner); > >>> 1059 pr_debug("initialization complete\n"); > >>> 1060 > >>> 1061 return 0; > >> > >> That actually looks wrong. And shoud be fixed. > > > > OK > > > >>> So, I wonder if it would be a good idea to instead allow that CPU to take a > >>> module refcount as well. That way, the problem that Toralf reported would go > >>> away, and at the same time, we can ensure that as long as the cpufreq core is > >>> managing CPUs, the cpufreq-backend-driver module refcount never drops to zero. > >>> > >>> Something like this: > >>> > >>> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c > >>> index a4ad733..ecfbc52 100644 > >>> --- a/drivers/cpufreq/cpufreq.c > >>> +++ b/drivers/cpufreq/cpufreq.c > >>> @@ -878,9 +878,14 @@ static int cpufreq_add_dev_interface(unsigned int cpu, > >>> } > >>> write_unlock_irqrestore(&cpufreq_driver_lock, flags); > >>> > >>> + /* Bump up the refcount for this CPU */ > >>> + policy = cpufreq_cpu_get(cpu); > >>> + > >>> ret = cpufreq_add_dev_symlink(cpu, policy); > >>> - if (ret) > >>> + if (ret) { > >>> + cpufreq_cpu_put(policy); > >>> goto err_out_kobj_put; > >>> + } > >> > >> That will do an extra kobject_get() which we don't require.. Only removing what > >> I mentioned earlier should be good enough, I believe. > >> > >> Over that, I think below code in __cpufreq_governor() is also wrong. > >> > >> /* we keep one module reference alive for > >> each CPU governed by this CPU */ > >> if ((event != CPUFREQ_GOV_START) || ret) > >> module_put(policy->governor->owner); > >> if ((event == CPUFREQ_GOV_STOP) && !ret) > >> module_put(policy->governor->owner); > >> > >> The second if is sensible as it puts count when governor is stopped. > >> But the first one decrements the count when we failed for anything > >> other than START.. > >> > >> But this routine is called for other stuff too: > >> - INIT/Exit > >> - LIMITS, > >> > >> And so, count must be incremented for a passed INIT call and > >> decremented for a passed EXIT call, otherwise shouldn't be > >> touched. > > > > That sounds good, but I don't want to make those changes for 3.11 and at the > > same time I *do* want the reference count issue to go away. > > > > So the patch below is the one I'd like to apply for the time being and > > we can work on more improvements on top of that. > > > > Any objections? > > > > Toralf, please test this patch in the meantime. > > works - tested on top of 3.10.4 Great, thanks for the confirmation! Rafael > > --- > > From: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx> > > Subject: cpufreq: Fix cpufreq driver module refcount balance after suspend/resume > > > > Since cpufreq_cpu_put() called by __cpufreq_remove_dev() drops the > > driver module refcount, __cpufreq_remove_dev() causes that refcount > > to become negative for the cpufreq driver after a suspend/resume > > cycle. > > > > This is not the only bad thing that happens there, however, because > > kobject_put() should only be called for the policy kobject at this > > point if the CPU is not the last one for that policy. > > > > Namely, if the given CPU is the last one for that policy, the > > policy kobject's refcount should be 1 at this point, as set by > > cpufreq_add_dev_interface(), and only needs to be dropped once for > > the kobject to go away. This actually happens under the cpu == 1 > > check, so it need not be done before by cpufreq_cpu_put(). > > > > On the other hand, if the given CPU is not the last one for that > > policy, this means that cpufreq_add_policy_cpu() has been called > > at least once for that policy and cpufreq_cpu_get() has been > > called for it too. To balance that cpufreq_cpu_get(), we need to > > call cpufreq_cpu_put() in that case. > > > > Thus, to fix the described problem and keep the reference > > counters balanced in both cases, move the cpufreq_cpu_get() call > > in __cpufreq_remove_dev() to the code path executed only for > > CPUs that share the policy with other CPUs. > > > > Reported-by: Toralf Förster <toralf.foerster@xxxxxx> > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx> > > --- > > drivers/cpufreq/cpufreq.c | 19 ++++++++++--------- > > 1 file changed, 10 insertions(+), 9 deletions(-) > > > > Index: linux-pm/drivers/cpufreq/cpufreq.c > > =================================================================== > > --- linux-pm.orig/drivers/cpufreq/cpufreq.c > > +++ linux-pm/drivers/cpufreq/cpufreq.c > > @@ -1177,14 +1177,11 @@ static int __cpufreq_remove_dev(struct d > > __func__, cpu_dev->id, cpu); > > } > > > > - if ((cpus == 1) && (cpufreq_driver->target)) > > - __cpufreq_governor(data, CPUFREQ_GOV_POLICY_EXIT); > > - > > - pr_debug("%s: removing link, cpu: %d\n", __func__, cpu); > > - cpufreq_cpu_put(data); > > - > > /* If cpu is last user of policy, free policy */ > > if (cpus == 1) { > > + if (cpufreq_driver->target) > > + __cpufreq_governor(data, CPUFREQ_GOV_POLICY_EXIT); > > + > > lock_policy_rwsem_read(cpu); > > kobj = &data->kobj; > > cmp = &data->kobj_unregister; > > @@ -1205,9 +1202,13 @@ static int __cpufreq_remove_dev(struct d > > free_cpumask_var(data->related_cpus); > > free_cpumask_var(data->cpus); > > kfree(data); > > - } else if (cpufreq_driver->target) { > > - __cpufreq_governor(data, CPUFREQ_GOV_START); > > - __cpufreq_governor(data, CPUFREQ_GOV_LIMITS); > > + } else { > > + pr_debug("%s: removing link, cpu: %d\n", __func__, cpu); > > + cpufreq_cpu_put(data); > > + if (cpufreq_driver->target) { > > + __cpufreq_governor(data, CPUFREQ_GOV_START); > > + __cpufreq_governor(data, CPUFREQ_GOV_LIMITS); > > + } > > } > > > > per_cpu(cpufreq_policy_cpu, cpu) = -1; > > > > > > > -- I speak only for myself. Rafael J. Wysocki, Intel Open Source Technology Center. -- To unsubscribe from this list: send the line "unsubscribe cpufreq" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html