On Monday, July 29, 2013 08:57:20 PM Srivatsa S. Bhat wrote: > On 07/29/2013 05:34 PM, Rafael J. Wysocki wrote: > > On Monday, July 29, 2013 05:14:25 PM Srivatsa S. Bhat wrote: > >> On 07/29/2013 05:19 PM, Rafael J. Wysocki wrote: > >>> On Monday, July 29, 2013 03:11:45 PM Srivatsa S. Bhat wrote: > >>>> On 07/29/2013 04:50 AM, Rafael J. Wysocki wrote: > >>>>> On Monday, July 29, 2013 12:43:59 AM Rafael J. Wysocki wrote: > >>>>>> On Monday, July 29, 2013 12:11:18 AM Rafael J. Wysocki wrote: > >>>>>>> On Sunday, July 28, 2013 12:21:22 PM Toralf Förster wrote: > >>>>>>>> On 07/28/2013 01:39 AM, Rafael J. Wysocki wrote: > >>>>>>>>> On Saturday, July 27, 2013 07:40:34 PM Toralf Förster wrote: > >>>>>>>>>> it gives at a ThinkPad T420: > >>>>>>>>>> > >>>>>>>>>> tfoerste@n22 ~/tmp $ lsmod | grep ^acpi_cpufreq > >>>>>>>>>> acpi_cpufreq 12902 2147483647 > >>>>>>>>> > >>>>>>>>> That is -1, which indicates some module refcount woes. > >>>>>>>> > >>>>>>>> yes, ofc. > >>>>>>>> > >>>>>>>> The issue apears after 1 s2ram/resume cycle, before s2ram the refcount is 1. > >>>>>>>> > >>>>>>>>> I definitely can't see that with the mainline on my machines. > >>>>>>>> > >>>>>>>> It is in mainline too. > >>>>>>> > >>>>>>> Does the appended patch help? > >>>>>> > >>>>>> Actually, something as simple as this also should help: > >>>>>> > >>>>>> --- > >>>>>> From: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx> > >>>>>> Subject: cpufreq: Fix cpufreq driver module refcount balance after suspend/resume > >>>>>> > >>>>>> Since cpufreq_cpu_put() called by __cpufreq_remove_dev() drops the > >>>>>> driver module refcount, __cpufreq_remove_dev() causes that refcount > >>>>>> to become negative after a suspend/resume cycle, for example. > >>>>>> > >>>>>> To prevent this from happening make __cpufreq_remove_dev() put > >>>>>> the policy kobject only instead of calling cpufreq_cpu_put(). > >>>>> > >>>>> Having a deeper look at it, though, I see that in fact the whole > >>>>> cpufreq_cpu_put() is needed if the CPU is not the last one for the given > >>>>> policy and is not needed at all otherwise (as described in the changelog > >>>>> of the patch below). > >>>>> > >>>>> Srivatsa, does this make sense to you? > >>>>> > >>>> > >>>> Code-wise, your patch looks good to me. But one thing in the existing code > >>>> struck me as a little strange. > >>>> > >>>> I'm assuming that the module_get() is used in the cpufreq core to ensure that > >>>> until the cpufreq core is managing (atleast one) CPU(s), the cpufreq backend > >>>> driver module (eg: acpi-cpufreq) can't be removed. > >>> > >>> Quite frankly, I'm not sure about that. If that were the case, > >>> cpufreq_add_dev() would not call module_put() which it does. > >>> > >>> That may be a bug, I agree, but that's not for the present release cycle. For > >>> now, we need to ensure that the reference counts are *balanced* . > >>> > >> > >> Sure, in that case, I agree that your patch is the right fix at this point, > >> since it resolves the immediate problem that we have with the refcounts. > >> > >> Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@xxxxxxxxxxxxxxxxxx> > > > > Great, thanks! > > > > Does your patchset avoiding the creation/removal of sysfs directories over > > suspend/resume need to be modified to take this patch into account? > > > > There will be some trivial conflicts, but looking a bit closer at my patchset, > I believe it has a subtle refcounting bug, relating to patches 6 and 7. > > Patch 6: http://marc.info/?l=linux-kernel&m=137358141628042&w=2 > Patch 7: http://marc.info/?l=linux-pm&m=137358124527993&w=2 > > > Patch 6 modifies cpufreq_add_policy_cpu() to return 0 if frozen == true. > But by that time, it would have already done a cpufreq_cpu_get(). > > In patch 7, under __cpufreq_remove_dev(), we avoid dropping refcounts > for *any* CPU when frozen == true. > > In the general case, this is fine (like myself and Viresh had discussed > earlier in that thread), because __cpufreq_add_dev() doesn't increment the > refcount in the frozen case. > > But we overlooked a rather subtle scenario: say, CPU 'X' went through > cpufreq_add_dev in the resume path, and restored its policy structure. > Since this was fresh, no refcount was taken. And then CPU 'Y' came along, > but it was related to CPU 'X', so it will take this path: > > 1018 /* Check if this cpu was hot-unplugged earlier and has siblings */ > 1019 read_lock_irqsave(&cpufreq_driver_lock, flags); > 1020 for_each_online_cpu(sibling) { > 1021 struct cpufreq_policy *cp = per_cpu(cpufreq_cpu_data, sibling); > 1022 if (cp && cpumask_test_cpu(cpu, cp->related_cpus)) { > 1023 read_unlock_irqrestore(&cpufreq_driver_lock, flags); > 1024 return cpufreq_add_policy_cpu(cpu, sibling, dev, > 1025 frozen); > 1026 } > 1027 } > > This time, cp won't be NULL for CPU 'X', because its policy was restored > in the previous call. So we end up calling cpufreq_add_policy_cpu() for > CPU 'Y'. And that function takes a new refcount, as mentioned above. > > So we need to have this extra hunk in cpufreq_add_policy_cpu() to ensure > we don't take any additional refcounts during suspend/resume: > > > diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c > index c7f59e8..9681c01 100644 > --- a/drivers/cpufreq/cpufreq.c > +++ b/drivers/cpufreq/cpufreq.c > @@ -932,8 +932,11 @@ static int cpufreq_add_policy_cpu(unsigned int cpu, unsigned int sibling, > } > > /* Don't touch sysfs links during light-weight init */ > - if (frozen) > + if (frozen) { > + /* Drop the extra refcount that we took above */ > + cpufreq_cpu_put(policy); > return 0; > + } > > ret = sysfs_create_link(&dev->kobj, &policy->kobj, "cpufreq"); > if (ret) > > > I can respin my entire patchset with this fix included, on top of your > patch. Please let me know if you'd prefer some other method for resolving > this. Please respin the patchset with the fix folded in. My patch is in the bleeding-edge branch of linux-pm.git, so please rebase the new version on that if that's not a problem. Thanks, Rafael -- I speak only for myself. Rafael J. Wysocki, Intel Open Source Technology Center. -- To unsubscribe from this list: send the line "unsubscribe cpufreq" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html