On Friday, September 06, 2013 11:33:55 AM Srivatsa S. Bhat wrote: > On 09/05/2013 06:24 AM, Stephen Boyd wrote: > > On 09/04/13 17:26, Rafael J. Wysocki wrote: > >> On Wednesday, September 04, 2013 04:50:01 PM Stephen Boyd wrote: > >>> On 09/04/13 16:55, Rafael J. Wysocki wrote: > >>>> Well, I'm not sure when Viresh is going to be back. > >>>> > >>>> Srivatsa, can you please resend this patch with a proper changelog? > >>>> > >>> I haven't had a chance to try this out yet, but I was just thinking > >>> about this patch. How is it going to work? If one task opens the file > >>> and another task is taking down the CPU wouldn't we deadlock in the > >>> CPU_DOWN notifier waiting for the kobject to be released? Task 1 will > >>> grab the kobject reference and sleep on the hotplug mutex and task 2 > >>> will put the kobject and wait for the completion, but it won't happen. > >>> At least I think that's what would happen. > >> Do you mean the completion in sysfs_deactivate()? Yes, we can deadlock > >> there. > > > > I mean the complete in cpufreq_sysfs_release(). I don't think that will > > ever be called because the kobject is held by the task calling store() > > which is waiting on the hotplug lock to be released. > > > >> > >> Well, I guess the Srivatsa's patch may be salvaged by making it do a "trylock" > >> version of get_online_cpus(), but then I wonder if there's no better way. > > > > I think the real solution is to remove the kobject first if the CPU > > going down is the last user of that policy. Once the completion is done > > we can stop the governor and clean up state. For the case where there > > are still CPUs using the policy why can't we cancel that CPU's work and > > do nothing else? Even in the case where we have to move the cpufreq > > directory do we need to do a STOP/START/LIMITS sequence? I hope we can > > get away with just moving the directory and canceling that CPUs work then. > > > > Conceptually, I agree that your idea of not allowing any process to take > a new reference to the kobject while we are taking the CPU offline, is a > sound solution. > > However, I am reluctant to go down that path because, handling the CPU hotplug > sequence in the suspend/resume path might get even more tricky, if we want > to implement the changes that you propose. Just recently we managed to > rework the cpufreq CPU hotplug handling to retain the sysfs file permissions > around suspend/resume, and doing that was not at all simple. Adding more > quirks and complexity to the kobject handling in that path will only make > things even more challenging, IMHO. That's the reason I'm trying to think > of ways to avoid touching that fragile code, and instead solve this problem > in some other way, without compromising on the robustness of the solution. > > So here is my new proposal, as a replacement to this patch[2/2]: Actually, I've already applied patch [2/2], because while it doesn't solve the whole problem, it narrows it down somewhat. So, please work on top of that patch going forward. > We note that, at CPU_DOWN_PREPARE stage, the CPU is not yet marked offline, > whereas by the time we handle CPU_POST_DEAD, the CPU is removed from the > cpu_online_mask, and also the cpu_hotplug lock is dropped. > > So, let us split up __cpu_remove_dev() into 2 parts, say: > __cpu_remove_prepare() - invoked during CPU_DOWN_PREPARE > __cpu_remove_finish() - invoked during CPU_POST_DEAD > > > In the former function, we stop the governors, so that policy->governor_enabled > gets set to false, so that patch [1/2] will return -EBUSY to any subsequent > ->store() requests. Also, we do everything except the kobject cleanup. > > In the latter function, we do the remaining work, particularly the part > where we wait for the kobject refcount to drop to zero and the subsequent > cleanup. > > > And the ->store() functions will be modified to look like this: > > store() > { > get_online_cpus(); > > if (!cpu_online(cpu)) > goto out; > > /* Body of the function*/ > out: > put_online_cpus(); > } > > That way, if a task tries to write to a cpufreq file during CPU offline, > it will get blocked on get_online_cpus(), and will continue after > CPU_DEAD (since we release the lock here). Then it will notice that the cpu > is offline, and hence will return silently, thus dropping the kobject refcnt. > So, when the cpufreq core comes back at the CPU_POST_DEAD stage to cleanup > the kobject, it won't encounter any problems. > > Any thoughts on this approach? I'll try to code it up and post the patch > later today. It sounds good to me (modulo the remark above). Thanks, Rafael -- To unsubscribe from this list: send the line "unsubscribe cpufreq" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html