On 07/02/2012 10:57 AM, preeti wrote:
> On 07/02/2012 10:34 AM, Srivatsa S. Bhat wrote:
>> On 07/02/2012 10:07 AM, preeti wrote:
>>> On 06/30/2012 03:41 AM, Rafael J. Wysocki wrote:
>>>> On Friday, June 29, 2012, preeti wrote:
>>>>> On 06/29/2012 09:41 AM, preeti wrote:
>>>>>> On 06/29/2012 12:41 AM, Rafael J. Wysocki wrote:
>>>>>>> On Thursday, June 28, 2012, preeti wrote:
>>>>>>>> On 06/28/2012 03:23 PM, Rafael J. Wysocki wrote:
>>>>>>>>> On Thursday, June 28, 2012, preeti wrote:
>>>>>>>>>>
>>>>>>>>>> From: Preeti U Murthy <preeti@xxxxxxxxxxxxxxxxxx>
>>>>>>> [...]
>>>>>>>> cpuidle is an architecture independent part of the kernel code. Since
>>>>>>>> this patch aims at x86 architecture in specific, I considered it
>>>>>>>> inappropriate.
>>>>>>>>
>>>>>>>> In addition to this, suspend happens on x86 only if ACPI is configured.
>>>>>>>
>>>>>>> But that is not required for intel_idle, so if it hangs with intel_idle,
>>>>>>> then it is not dependent on ACPI after all.
>>>>>> True, intel_idle does not need ACPI to be configured, but that also means
>>>>>> that the kernel will not provide you the means to suspend. There is no
>>>>>> question of resume hang here at all as you cannot suspend in the first
>>>>>> place.
>>>>>>
>>>>>> The issue is when ACPI is configured, and intel_idle is chosen to be the
>>>>>> cpuidle driver. In this situation when the user suspends, cpus can enter
>>>>>> deep sleep states as intel_idle driver does not prevent them from doing so.
>>>>>> This is when resume hangs.
>>>>>>>
>>>>>>>> Therefore it seemed right to put the callback in ACPI specific code
>>>>>>>> which deals with ACPI sleep support.
>>>>>>>
>>>>>>> I wonder if we can address this issue correctly. That is, in a non-racy
>>>>>>> way and in a better place.
>>>>>>>
>>>>>>> First, I really don't think it is necessary to "suspend" cpuidle (be it
>>>>>>> ACPI or any other) when device drivers' suspend routines are being
>>>>>>> executed (which also is racy, because the cpuidle "suspend" may be running
>>>>>>> concurrently with cpuidle on another CPU) or earlier. We really may want
>>>>>>> to disable the deeper C-states when we're about to execute
>>>>>>> suspend_ops->prepare_late(), or hibernation_ops->prepare(), i.e. after
>>>>>>> we've run dpm_suspend_end() successfully.
>>>>>>
>>>>>> The commit "ACPI: disable lower idle C-states across suspend/resume"
>>>>>> states that device_suspend() calls ACPI suspend functions which cause
>>>>>> side effects on the lower idle C-states. This means we need to disable
>>>>>> entry into deeper C-states even before dpm_suspend_start(), but how much
>>>>>> before?
>>>>>>
>>>>>> The commit answers this too. It says removing the functionality of
>>>>>> entering deep C-states before suspend removed the side effects. Besides,
>>>>>> the commit title says 'across suspend/resume'. So I think to address the
>>>>>> resume hang effectively, it is desirable to disable entry into deeper
>>>>>> C-states during suspend_prepare operations.
>>>>>
>>>>> To clarify this further, since we take action upon PM_SUSPEND_PREPARE
>>>>> notification, which is called before suspend begins, we avoid race
>>>>> condition between suspend operations and disabling entry into deeper
>>>>> C-states altogether.
>>>>
>>>> Well, what about races between disabling deeper C-states and cpuidle?
>>>
>>> Yes. The question still remains about the cpus that have already entered
>>> deep C-states even before suspend routines have begun. We are not taking
>>> precautions to prevent them from going into idle.
>>>
>>
>> Actually we need *not* take such precautions. See below.
>>
>>> If the resume hang does depend on the cpus being in deep C-state, even
>>> after the fix with acpi_idle_suspend, there should have been a hang
>>> in scenarios where the cpus have already entered deep C-states before
>>> suspend has begun.
>>>
>>
>> Nope, that won't happen because we have CPU hotplug in between. The suspend
>> path goes through CPU hotplug (cpu offline), and one of the phases of the
>> cpu offline operation requires that the cpu that is going down runs the
>> CPU_DYING_FROZEN callbacks. No other cpu can execute that. So even if a cpu
>> was in a deep C-state, it would be kicked out of cpu idle and will run
>> these callbacks during cpu hotplug. It's enough if we ensure that it doesn't
>> enter deep C-states again, *after* the cpu hotplug operation. And the flag you
>> are using or the callback method that Rafael suggested looks sufficient for
>> ensuring that.
>>
>> So, we need not break our heads on too many race conditions here :-)
>
> But let us note that without the acpi_idle_suspend check, it was at the
> device layer, that the hang was happening. Before cpu hotplug even begins.
>

acpi_idle_suspend check has nothing to do with hang at the device layer, IIRC.
The hang at device layer was because we were coming out of cpu idle without
enabling interrupts. And Deepthi already fixed that issue (commit 75cc523
upstream).

So the problem that still remains is the _resume_ hang when using intel idle
as the idle driver. And not allowing cpus to be in deep C-states while doing
suspend should fix that.

Regards,
Srivatsa S. Bhat
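
For illustration only, the flag-based idea discussed in this thread (block
entry into deep C-states from the suspend-prepare notification until the
suspend is over) could be modelled roughly as below. This is a userspace
sketch in C, not kernel code and not the patch itself: every identifier
(model_pm_event, model_pm_notify, model_pick_cstate, deep_cstates_blocked)
is hypothetical, and treating state 1 as the shallowest idle state is an
assumption made for the model.

```c
#include <stdbool.h>

/*
 * Userspace model only. These names are hypothetical and are not taken
 * from the patch or from the kernel; the two events merely stand in for
 * the "before suspend" and "after suspend" notifications.
 */
enum model_pm_event {
	MODEL_SUSPEND_PREPARE,	/* suspend is about to begin */
	MODEL_POST_SUSPEND	/* suspend/resume cycle has finished */
};

/* Set while a suspend is in flight; deep idle states are refused then. */
static bool deep_cstates_blocked;

/* Models a PM notifier callback: flip the flag around the suspend. */
static void model_pm_notify(enum model_pm_event ev)
{
	if (ev == MODEL_SUSPEND_PREPARE)
		deep_cstates_blocked = true;
	else if (ev == MODEL_POST_SUSPEND)
		deep_cstates_blocked = false;
}

/*
 * Models idle-state selection. State 1 is assumed to be the shallowest
 * idle state; while the flag is set, any deeper request is clamped to it.
 */
static int model_pick_cstate(int requested)
{
	if (deep_cstates_blocked && requested > 1)
		return 1;
	return requested;
}
```

In this model a cpu that was already idle in a deep state is not handled at
all. As argued above, that case would be covered by the cpu offline step in
the suspend path, so the flag only has to stop cpus from re-entering deep
states afterwards.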