On 08/01/2013 08:14 PM, Viresh Kumar wrote: > On 1 August 2013 13:41, Srivatsa S. Bhat > <srivatsa.bhat@xxxxxxxxxxxxxxxxxx> wrote: >> On 08/01/2013 05:38 AM, Rafael J. Wysocki wrote: >>> From: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx> >>> >>> The cpufreq core is a little inconsistent in the way it uses the >>> driver module refcount. >>> >>> Namely, if __cpufreq_add_dev() is called for a CPU without siblings >>> or generally a CPU for which a new policy object has to be created, >>> it grabs a reference to the driver module to start with, but drops >>> that reference before returning. As a result, the driver module >>> refcount is then equal to 0 after __cpufreq_add_dev() has returned. >>> >>> On the other hand, if the given CPU is a sibling of some other >>> CPU already having a policy, cpufreq_add_policy_cpu() is called >>> to link the new CPU to the existing policy. In that case, >>> cpufreq_cpu_get() is called to obtain that policy and grabs a >>> reference to the driver module, but that reference is not >>> released and the module refcount will be different from 0 after >>> __cpufreq_add_dev() returns (unless there is an error). That >>> prevents the driver module from being unloaded until >>> __cpufreq_remove_dev() is called for all the CPUs that >>> cpufreq_add_policy_cpu() was called for previously. >>> >>> To remove that inconsistency make cpufreq_add_policy_cpu() execute >>> cpufreq_cpu_put() for the given policy before returning, which >>> decrements the driver module refcount so that it will be 0 after >>> __cpufreq_add_dev() returns, >> >> Removing the inconsistency is a good thing, but I think we should >> make it consistent the other way around - make a CPU-online increment >> the driver module refcount and decrement it only on CPU-offline. > > I took time to review to this mail as I was looking at the problem > yesterday. I am sorry to say, but I have completely different views as > compared to You and Rafael both :) > > First of all, Rafael's patch is incomplete as it hasn't fixed the issue > completely. When we have multiple CPUs per policy and > cpufreq_add_dev() is called for the first one, it call cpufreq_get_cpu() > for all cpus of this policy(), so count is == x (no. of CPUs in this policy) > + 1 (This is the initial value of .owner). > > And so we still have module count getting incremented for other cpus :) > Good catch! > Now few lines about My point of view to this whole thing. I believe we > should get rid of .owner field from struct cpufreq_driver completely. It > doesn't make sense to me in doing all this management at all. Surprised? > Shocked? Laughing at me? :) > > Well I may be wrong but this is what I think: > - It looks stupid to me that I can't do this from userspace in one go: > $ insmod cpufreq_driver.ko > $ rmmod cpufreq_driver.ko > > What the hell changed in between that isn't visible to user? It looked > completely stupid in that way.. > > Something like this sure makes sense: > $ insmod ondemand-governor.ko > $ change governor to ondemand for few CPUs > $ rmmod ondemand-governor.ko > > as we have deliberately add few users of governor. And so without second > step, rmmod should really work smoothly. And it does. > > Now, why shouldn't there be a problem with this approach? I will write > that inline to the problems you just described. > >> The reason is, one should not be able to unload the back-end cpufreq >> driver module when some CPUs are still being managed. Nasty things >> will result if we allow that. For example, if we unload the module, >> and then try to do a CPU offline, then the cpufreq hotplug notifier >> won't even be called (because cpufreq_unregister_driver also >> unregisters the hotplug notifier). And that might be troublesome. > > So what? Its simply equivalent to we have booted our system, haven't > inserted cpufreq module and taken out a cpu. > >> Even worse, if we unload a cpufreq driver module and load a new >> one and *then* try to offline the CPU, then the cpufreq_driver->exit() >> function that we call during CPU offline will end up calling the >> corresponding function of an entirely different driver! So the >> ->init() and ->exit() calls won't match. > > That's not true. When we unload the module, it must call > cpufreq_unregister_driver() which should call cpufreq_remove_cpu() > for all cpus and so exit() is already called for last module. > Sorry, I missed this one. > If we get something new now, it should simply work. > Yeah, I now see your point. It won't create any problems by unloading the module and loading a new one. > What do you think gentlemen? > Well, I now agree that we don't have to keep the module refcount non-zero as long as CPUs are being managed (that was just my misunderstanding, sorry for the noise). However, I think the _get() and _put() used in the existing code is for synchronization: that is, to avoid races between trying to unload the cpufreq driver module and running a hotplug notifier (and calling the driver module's ->init() or ->exit() function). With that being the case, I think we can retain the module refcounts and use them only for synchronization. And naturally the refcount should drop to zero after the critical section; no point keeping it incremented until the CPU is taken offline. Or, am I confused again? Regards, Srivatsa S. Bhat -- To unsubscribe from this list: send the line "unsubscribe cpufreq" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html