Re: Need Help to debug resume hangs at schedule() function

Hi,

	Thanks for the reply. I reproduced the issue. When I try to resume, the kernel resumes, but a suspend operation is somehow triggered again and the system immediately goes back into deep sleep. During this second suspend it enters "try_to_freeze_tasks" to freeze all the user-space threads, but it hangs after some time while it is trying to resume a user-space thread. I confirmed this by adding printk calls to the "try_to_freeze_tasks" function.
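
A rough way to confirm the back-to-back suspend from a saved console log is to look for a suspend that starts within a few seconds of the previous resume. The Python sketch below assumes the log lines carry printk timestamps (CONFIG_PRINTK_TIME) and that the two marker strings match what the kernel in question prints. The function name, the 5-second window and the sample log are made up for illustration.

```python
import re

# printk timestamp prefix, e.g. "[  123.456789] message"
TS = re.compile(r"^\[\s*(\d+\.\d+)\]\s*(.*)$")

# Assumed marker strings; adjust them to what your kernel actually prints.
RESUME_MARK = "Restarting tasks"
SUSPEND_MARK = "Freezing user space processes"


def find_quick_resuspends(log_text, window_s=5.0):
    """Return (resume_time, suspend_time) pairs where a new suspend
    started within window_s seconds of the preceding resume."""
    hits = []
    last_resume = None
    for line in log_text.splitlines():
        m = TS.match(line)
        if not m:
            continue  # skip lines without a timestamp
        t, msg = float(m.group(1)), m.group(2)
        if RESUME_MARK in msg:
            last_resume = t
        elif SUSPEND_MARK in msg and last_resume is not None:
            if t - last_resume <= window_s:
                hits.append((last_resume, t))
            last_resume = None  # each resume is paired with one suspend
    return hits


# Invented sample log, only to show the expected input shape.
SAMPLE = """\
[  100.000000] Freezing user space processes ... done.
[  130.500000] Restarting tasks ... done.
[  131.200000] Freezing user space processes ...
"""

if __name__ == "__main__":
    for resume_t, suspend_t in find_quick_resuspends(SAMPLE):
        print(f"suspend at {suspend_t} only {suspend_t - resume_t:.1f}s after resume")
```

If this reports hits only in the runs that hang, the unexpected second suspend is worth chasing before the freezer itself.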

Has anyone faced an issue like this?


Regards,
Purusothaman R

-----Original Message-----
From: MyungJoo Ham [mailto:myungjoo.ham@xxxxxxxxx] 
Sent: Wednesday, May 04, 2011 6:14 AM
To: Purusothaman Ramajothi (WT01 - Manufacturing & Hi Tech)
Cc: linux-pm@xxxxxxxxxxxxxxxxxxxxxxxxxx
Subject: Re:  Need Help to debug resume hangs at schedule() function

On Tue, May 3, 2011 at 6:37 PM,  <purusothaman.ramajothi@xxxxxxxxx> wrote:
> Hi,
>
>        Thanks for the help. According to my understanding, the code flow for the resume operation in the kernel is as follows:
>
> 1. device_resume()
> 2. dpm_resume()
> 3. dpm_complete()
> 4. thaw_processes()
> 5. thaw_tasks()
> 6. thaw_process()
> 7. schedule()
>
> I have compared the "resume hang logs" with the "working logs". During resume, all my drivers and devices resumed. The last function entered is "thaw_processes", and I think the kernel hangs around this function. I confirmed this by adding checkpoints in the deep-sleep code flow: the logs from a hang do not contain the printk message "Restarting tasks ... done".

Well, then, more printk (pr_emerg?) messages? With early printk and/or
low-level debugging options?

Besides, if the crash log stops right before "Restarting...", then the
crash point could be earlier than thaw_processes.
(Or are you crashing at oom_killer_enable? Well, that still means you
need more printks.)
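
Once the extra printks are in, finding where a hang log stops can be automated. The sketch below is one possible approach in Python, not a tool from this thread: it walks an ordered list of checkpoint strings through a saved log and reports the last one seen and the first one missing. The "CKPT ..." strings are hypothetical; replace them with the text of your own printk calls, in the order a good resume emits them.

```python
# Checkpoint strings in the order a successful resume prints them.
# These names are hypothetical placeholders, not real kernel messages.
EXPECTED = [
    "CKPT device_resume",
    "CKPT thaw_processes enter",
    "CKPT thaw_tasks kernel",
    "CKPT thaw_tasks user",
    "CKPT resume done",
]


def last_checkpoint(log_text, expected=EXPECTED):
    """Return (last_seen, first_missing) for an ordered checkpoint list.

    Markers are searched in order, each one only after the previous
    match, so out-of-order leftovers from an earlier cycle are ignored.
    """
    pos = 0
    last_seen = None
    for marker in expected:
        idx = log_text.find(marker, pos)
        if idx < 0:
            return last_seen, marker
        last_seen = marker
        pos = idx + len(marker)
    return last_seen, None


# Invented log from a hanging run, only to show the input shape.
HANG_LOG = "CKPT device_resume\nCKPT thaw_processes enter\nCKPT thaw_tasks kernel\n"

if __name__ == "__main__":
    seen, missing = last_checkpoint(HANG_LOG)
    print(f"last checkpoint seen: {seen}; first missing: {missing}")
```

Treat the result as a hint rather than proof: if the console did not flush its final lines, the kernel may have progressed past the last checkpoint that reached the log.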

>
> Can you explain in what scenarios the kernel can hang around thaw_processes?
> Please correct me if the above code flow is incorrect.

- There could be a delayed fatal event, triggered by earlier operations
(device drivers), that fires right after thawing begins.
- The last few printk messages may simply not have been flushed to the
display at the crash point.
- Even if it stopped at thaw_processes, that does not necessarily mean that
something is wrong in thaw_processes. You already have your irqs
enabled and devices are resumed. Moreover, some device drivers may
have a "delayed resume" feature, usually implemented with a work struct or
delayed work.

>
> Regards,
> Purusothaman R
>
>
> -----Original Message-----
> From: MyungJoo Ham [mailto:myungjoo.ham@xxxxxxxxx]
> Sent: Tuesday, May 03, 2011 12:45 PM
> To: Purusothaman Ramajothi (WT01 - Manufacturing & Hi Tech)
> Cc: linux-pm@xxxxxxxxxxxxxxxxxxxxxxxxxx
> Subject: Re:  Need Help to debug resume hangs at schedule() function
>
> On Tue, May 3, 2011 at 3:38 PM,  <purusothaman.ramajothi@xxxxxxxxx> wrote:
>> Hi,
>>
>>        Thanks for the reply. I am using the 2.6.27.18 kernel, but I think [Kernel hacking] ==> [Detect Hung Tasks] was only added in the 2.6.30 kernel.
>>
>> Is there any patch that I can apply to enable the "detect hung tasks" feature in the 2.6.27 kernel?
>
> Clone a recent kernel.
> Run git log -- kernel/hung_task.c
> Pick the resulting patches (and more patches if required), apply them to
> yours, and resolve any conflicts.
>
>>
>> Is there any other way of debugging this issue using software methods?
>
> Add more printks. Store the whole console output somewhere and
> read/filter/search it later.
>
>>
>>
>> Regards,
>> Purusothaman R
>>
>>
>> -----Original Message-----
>> From: MyungJoo Ham [mailto:myungjoo.ham@xxxxxxxxx]
>> Sent: Tuesday, May 03, 2011 8:29 AM
>> To: Purusothaman Ramajothi (WT01 - Manufacturing & Hi Tech)
>> Cc: linux-pm@xxxxxxxxxxxxxxxxxxxxxxxxxx
>> Subject: Re:  Need Help to debug resume hangs at schedule() function
>>
>> On Mon, May 2, 2011 at 10:52 PM,  <purusothaman.ramajothi@xxxxxxxxx> wrote:
>>> Hi,
>>>
>>>
>>>
>>>                 I am using the 2.6.27.18 kernel. My machine is a powerpc
>>> architecture. I executed the deep sleep operation with the following
>>> command:
>>>
>>>
>>>
>>> "echo mem > /sys/power/state"
>>>
>>>
>>>
>>> The suspend operation was successful. All the devices and user-level threads
>>> were suspended.
>>>
>>>
>>>
>>> But sometimes during resume my machine hangs. This does not happen
>>> regularly; it occurs after 60 or 70 deep sleep operations. To
>>> debug this I added checkpoints in the deep sleep code flow. When the
>>> issue was reproduced, the kernel hung around the "thaw_tasks" or "schedule"
>>> call in "thaw_processes". These functions are defined in
>>> "kernel/power/process.c".
>>>
>>
>> Have you enabled [Kernel hacking] ==> [Detect Hung Tasks] and waited
>> more than 120 seconds after the hang?
>>
>>>
>>>
>>> Note: The final printk message "Restarting tasks ... done" was not displayed
>>> when this issue occurred, but all the devices and drivers had resumed.
>>>
>>>
>>>
>>> I was not able to debug this as the reproducibility rate was too low. Kindly
>>> help me in solving the issue.
>>
>> Around 1/100 is not that low for a suspend/resume issue. Don't get too
>> frustrated. You will probably encounter much rarer (let's say 1/10000)
>> incidents if you are doing the job with newly produced chips and
>> boards. :D
>>
>> If you have a H/W tracer similar to T32, try it. It helps a lot.
>>
>>>
>>>
>>>
>>> Regards,
>>>
>>> Purusothaman R
>>>
>>>
>>>
>>> Please do not print this email unless it is absolutely necessary.
>>>
>>> The information contained in this electronic message and any attachments to
>>> this message are intended for the exclusive use of the addressee(s) and may
>>> contain proprietary, confidential or privileged information. If you are not
>>> the intended recipient, you should not disseminate, distribute or copy this
>>> e-mail. Please notify the sender immediately and destroy all copies of this
>>> message and any attachments.
>>>
>>> WARNING: Computer viruses can be transmitted via email. The recipient should
>>> check this email and any attachments for the presence of viruses. The
>>> company accepts no liability for any damage caused by any virus transmitted
>>> by this email.
>>>
>>> www.wipro.com
>>>
>>> _______________________________________________
>>> linux-pm mailing list
>>> linux-pm@xxxxxxxxxxxxxxxxxxxxxxxxxx
>>> https://lists.linux-foundation.org/mailman/listinfo/linux-pm
>>>
>>
>> --
>> MyungJoo Ham, Ph.D.
>> Mobile Software Platform Lab,
>> Digital Media and Communications (DMC) Business
>> Samsung Electronics
>> cell: 82-10-6714-2858
>


