On 1 August 2013 13:41, Srivatsa S. Bhat <srivatsa.bhat@xxxxxxxxxxxxxxxxxx> wrote: > On 08/01/2013 05:38 AM, Rafael J. Wysocki wrote: >> From: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx> >> >> The cpufreq core is a little inconsistent in the way it uses the >> driver module refcount. >> >> Namely, if __cpufreq_add_dev() is called for a CPU without siblings >> or generally a CPU for which a new policy object has to be created, >> it grabs a reference to the driver module to start with, but drops >> that reference before returning. As a result, the driver module >> refcount is then equal to 0 after __cpufreq_add_dev() has returned. >> >> On the other hand, if the given CPU is a sibling of some other >> CPU already having a policy, cpufreq_add_policy_cpu() is called >> to link the new CPU to the existing policy. In that case, >> cpufreq_cpu_get() is called to obtain that policy and grabs a >> reference to the driver module, but that reference is not >> released and the module refcount will be different from 0 after >> __cpufreq_add_dev() returns (unless there is an error). That >> prevents the driver module from being unloaded until >> __cpufreq_remove_dev() is called for all the CPUs that >> cpufreq_add_policy_cpu() was called for previously. >> >> To remove that inconsistency make cpufreq_add_policy_cpu() execute >> cpufreq_cpu_put() for the given policy before returning, which >> decrements the driver module refcount so that it will be 0 after >> __cpufreq_add_dev() returns, > > Removing the inconsistency is a good thing, but I think we should > make it consistent the other way around - make a CPU-online increment > the driver module refcount and decrement it only on CPU-offline. I took time to review to this mail as I was looking at the problem yesterday. I am sorry to say, but I have completely different views as compared to You and Rafael both :) First of all, Rafael's patch is incomplete as it hasn't fixed the issue completely. When we have multiple CPUs per policy and cpufreq_add_dev() is called for the first one, it call cpufreq_get_cpu() for all cpus of this policy(), so count is == x (no. of CPUs in this policy) + 1 (This is the initial value of .owner). And so we still have module count getting incremented for other cpus :) Now few lines about My point of view to this whole thing. I believe we should get rid of .owner field from struct cpufreq_driver completely. It doesn't make sense to me in doing all this management at all. Surprised? Shocked? Laughing at me? :) Well I may be wrong but this is what I think: - It looks stupid to me that I can't do this from userspace in one go: $ insmod cpufreq_driver.ko $ rmmod cpufreq_driver.ko What the hell changed in between that isn't visible to user? It looked completely stupid in that way.. Something like this sure makes sense: $ insmod ondemand-governor.ko $ change governor to ondemand for few CPUs $ rmmod ondemand-governor.ko as we have deliberately add few users of governor. And so without second step, rmmod should really work smoothly. And it does. Now, why shouldn't there be a problem with this approach? I will write that inline to the problems you just described. > The reason is, one should not be able to unload the back-end cpufreq > driver module when some CPUs are still being managed. Nasty things > will result if we allow that. For example, if we unload the module, > and then try to do a CPU offline, then the cpufreq hotplug notifier > won't even be called (because cpufreq_unregister_driver also > unregisters the hotplug notifier). And that might be troublesome. So what? Its simply equivalent to we have booted our system, haven't inserted cpufreq module and taken out a cpu. > Even worse, if we unload a cpufreq driver module and load a new > one and *then* try to offline the CPU, then the cpufreq_driver->exit() > function that we call during CPU offline will end up calling the > corresponding function of an entirely different driver! So the > ->init() and ->exit() calls won't match. That's not true. When we unload the module, it must call cpufreq_unregister_driver() which should call cpufreq_remove_cpu() for all cpus and so exit() is already called for last module. If we get something new now, it should simply work. What do you think gentlemen? -- viresh -- To unsubscribe from this list: send the line "unsubscribe cpufreq" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html