On 10/12/2011 03:26 AM, Rafael J. Wysocki wrote: > On Tuesday, October 11, 2011, Srivatsa S. Bhat wrote: >> On 10/10/2011 08:46 PM, Srivatsa S. Bhat wrote: >>> On 10/10/2011 07:56 PM, Peter Zijlstra wrote: >>>> On Mon, 2011-10-10 at 18:15 +0530, Srivatsa S. Bhat wrote: >>>>>> + /* >>>>>> + * Prevent cpu online and suspend/hibernate (including freezer) >>>>>> + * operations from running in parallel. Fail cpu online if suspend or >>>>>> + * hibernate has already started. >>>>>> + */ >>>>>> + if (!trylock_pm_sleep()) >>>>> >>>>> Would it be better to hook into the suspend/hibernate notifiers and >>>>> use them to exclude cpu hotplug from suspend/hibernate, instead of >>>>> trying to take pm_mutex lock like this? >>>>> Peter, I remember you pointing out in another patch's review >>>>> (http://thread.gmane.org/gmane.linux.kernel/1198312/focus=1199087) >>>>> that introducing more locks in cpu hotplug would be a bad idea. Does that >>>>> comment hold here as well, or is this fine? >>>> >>>> Arguably pm_mutex is already involved in the whole hotplug dance due to >>>> suspend using it, that said, I'm not at all familiar with the whole >>>> suspend/hibernate side of things. >>>> >>>> I tried having a quick look this morning but failed to find the actual >>>> code. >>>> >>>> I think it would be good to have an overview of the various locks and a >>>> small description of how they interact/nest. >>>> >>> >>> Sure. I'll put together whatever I have understood, in the form of a patch >>> to Documentation/power directory and post it tomorrow, for the benefit of >>> all. >>> >> >> Here it is, just as promised :-) >> http://lkml.org/lkml/2011/10/11/393 > > Well, I have an idea. > > Why don't we make drivers/base/cpu.c:store_online() take pm_mutex > in addition to calling cpu_hotplug_driver_lock()? This at least > will make the interface mutually exclusive with suspend/hibernation. > Oh, no no.. We shouldn't be doing that even though it seems very innocuous, because of a subtle reason: the memory hotplug code called in cpu_up() tries to acquire pm_mutex! So we will end up deadlocking ourselves, due to recursive locking! See kernel/cpu.c: cpu_up() calls mem_online_node() [defined in mm/memory_hotplug.c/mem_online_node() which calls lock_memory_hotplug() which internally calls lock_system_sleep(), which is where it tries to get pm_mutex]. So this patchset implements the mutual exclusion in the cpu_up() function (i.e., a little bit deeper down the road than store_online() ) and solves the problem. I thought I had put a comment there in my code warning about this locking order thing, but I must have removed it in the last minute, just before posting it.... -- Regards, Srivatsa S. Bhat <srivatsa.bhat@xxxxxxxxxxxxxxxxxx> Linux Technology Center, IBM India Systems and Technology Lab -- To unsubscribe from this list: send the line "unsubscribe linux-doc" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html