On 12/16/2013 04:32 PM, Bjørn Mork wrote: > Viresh Kumar <viresh.kumar@xxxxxxxxxx> writes: >> On 8 December 2013 15:54, Zdenek Kabelac <zkabelac@xxxxxxxxxx> wrote: >>> I'm testing recent 3.13-rc kernel - and I've notice my Lenovo T61 is not >>> able to resume. After bisect I've found commit: >>> >>> 5a87182aa21d6d5d306840feab9321818dd3e2a3 >> >> That's my patch, sorry for all the trouble.. Just came back after vacations >> and multiple threads are there with this patch as culprit.. >> >> I just tested this patch again on Rafael's linux-next branch and suspend/resume >> and Hibernation are working fine for me on my Thinkpad T420. I don't see any >> issues at all.. >> >> scaling_governor: ondemand >> scaling_driver: acpi-cpufreq >> >>> When I just revert this commit with 374b105797c3d4f29c685f3be535c35f5689b30e >>> (3.13-rc3) suspend/resume works again. >>> (I'm using ondemand CPU governor) >> >>> Here is bisect log: >> >>> # bad: [2167e2399dc5e69c62db56d933e9c8cbe107620a] cpufreq: fix garbage >>> kobjects on errors during suspend/resume >>> git bisect bad 2167e2399dc5e69c62db56d933e9c8cbe107620a >> >> I don't read bisect logs well but doesn't you log say that the culprit is >> 2167e2399dc5e69c62db56d933e9c8cbe107620a instead? >> >> I am trying with this patch now, but will take some time for results to be out. > > It's both patches in combination. Interesting case, really :-) > True... Basically, what happens is that commit 5a87182 prevents POLICY_EXIT of governors during suspend - ie., it just does a CPUFREQ_GOV_STOP and prevents any further operations on governors. Later, commit 2167e23 removes the special suspend/resume handling for CPU hotplug and thus in this path, CPU offline sends out POLICY_EXIT and also frees up the memory allocated to the policy structure. Since POLICY_EXIT is prevented by the cpufreq_suspend() code, this tear-down only gets half-completed and thus goes into an inconsistent state. So, upon resume, the CPU online operation tries POLICY_INIT, GOV_START etc, and again none of these get executed because of the cpufreq_suspend() code. Eventually cpufreq_resume() will execute GOV_START, but by then the underlying policy-structure is a newly allocated one, and things have already been messed up so much that it just can't resume properly. Overall, the inconsistency in handling governor code during suspend/resume is the root-cause of the suspend/resume regression. And as Bjorn mentioned, this is caused by the interaction of the 2 commits with one another, and cannot be reproduced by either of the commits independently. IIUC, mainline should be fine now, since Rafael reverted both the commits. We need to be more careful in bringing back either functionality in the future. As Bjorn and Rafael rightly pointed out in the other email threads, we need to fundamentally rethink the design and come up with elegant interfaces/mechanisms in the cpufreq subsystem that will allow us to write code that works as expected and avoid such subtle regressions. Regards, Srivatsa S. Bhat -- To unsubscribe from this list: send the line "unsubscribe cpufreq" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html