On Tue, May 3, 2011 at 6:37 PM, <purusothaman.ramajothi@xxxxxxxxx> wrote:
> Hi,
>
> Thanks for the help. According to my understanding, below is the code flow for the resume operation in the kernel:
>
> 1. device_resume()
> 2. dpm_resume()
> 3. dpm_complete()
> 4. thaw_processes()
> 5. thaw_tasks()
> 6. thaw_process()
> 7. schedule()
>
> I have compared the "resume hang logs" with the "working logs". During resume all my "drivers" and "devices" resumed back. The final function entered is "thaw_processes". I think the kernel hangs around this function. I confirmed this by adding checkpoints in the code flow of deep sleep, because the logs I have don't contain the printk message "Restarting...tasks done" when the kernel hangs.

Well, then, more printk (pr_emerg?) messages? With early printk and/or
low-level debugging options?

Besides, if the crash log stops right before "Restarting...", then the
crash point could be earlier than thaw_processes. (Or are you crashing
at oom_killer_enable? Well, that still means you need more printks.)

>
> Can you explain to me in what scenarios the kernel can hang around thaw_processes?
> Please correct me if the above code flow is incorrect.

- There could be a delayed fatal event, triggered by earlier operations
(device drivers), that happens right after entering thawing.
- The last few printk messages may simply not have been flushed to the
display at the crash point.
- Even if it stopped at thaw_process, it does not necessarily mean that
something is wrong in thaw_processes. You already have your irqs
enabled and devices are resumed. Moreover, some device drivers may
have a "delayed resume" feature, usually implemented with a work struct
or delayed work.
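To make the "compare the hang log with the working log" step less manual,
here is a minimal sketch in Python. It assumes every checkpoint printk you
add carries a common tag; the tag "PM-CKPT:" and the function names
checkpoints() and locate_hang() are made up for this illustration, and the
log lines are invented samples, not real output. It only reports the last
checkpoint that reached the console, so, as noted above, the real hang may
be somewhat later (messages not flushed) or may have been caused earlier.

```python
MARKER = "PM-CKPT:"  # hypothetical tag; use whatever your own printk calls emit


def checkpoints(log_lines, marker=MARKER):
    """Return the checkpoint names found in a console log, in order."""
    found = []
    for line in log_lines:
        # partition() splits around the first occurrence of the marker;
        # sep is empty when the line has no marker at all.
        _, sep, rest = line.partition(marker)
        if sep:
            found.append(rest.strip())
    return found


def locate_hang(working_log, hang_log, marker=MARKER):
    """Return (last checkpoint the hang log reached,
    first checkpoint of the working log that the hang log never printed)."""
    expected = checkpoints(working_log, marker)
    reached = checkpoints(hang_log, marker)
    last = reached[-1] if reached else None
    seen = set(reached)
    for name in expected:
        if name not in seen:
            return last, name
    return last, None


# Invented sample logs, only to show the call.
working = [
    "[  101.10] PM-CKPT: dpm_resume done",
    "[  101.20] PM-CKPT: thaw_processes enter",
    "[  101.30] PM-CKPT: thaw_processes exit",
    "[  101.31] Restarting tasks ... done.",
]
hang = [
    "[  733.10] PM-CKPT: dpm_resume done",
    "[  733.20] PM-CKPT: thaw_processes enter",
]

print(locate_hang(working, hang))
# prints ('thaw_processes enter', 'thaw_processes exit')
```

Feeding it the saved console output of one good run and each hanging run
gives a quick table of where the runs diverge, which is easier to scan
than 60 or 70 raw logs.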
>
> Regards,
> Purusothaman R
>
>
> -----Original Message-----
> From: MyungJoo Ham [mailto:myungjoo.ham@xxxxxxxxx]
> Sent: Tuesday, May 03, 2011 12:45 PM
> To: Purusothaman Ramajothi (WT01 - Manufacturing & Hi Tech)
> Cc: linux-pm@xxxxxxxxxxxxxxxxxxxxxxxxxx
> Subject: Re: Need Help to debug resume hangs at schedule() function
>
> On Tue, May 3, 2011 at 3:38 PM, <purusothaman.ramajothi@xxxxxxxxx> wrote:
>> Hi,
>>
>> Thanks for the reply. I am using the 2.6.27.18 kernel. But I think [Kernel hacking] ==> [Detect Hung Tasks] was added in the 2.6.30 kernel.
>>
>> Is there any patch that I can apply to enable the "detect hung tasks" feature in the 2.6.27 kernel?
>
> Clone the recent kernel.
> Run git log -- kernel/hung_task.c
> Pick the resulting patches (and more patches if required), apply them to
> yours, and resolve conflicts if any exist.
>
>>
>> Is there any other way of debugging this issue using software methods?
>
> Add more printks. Store the whole console output somewhere and
> read/filter/search it later.
>
>>
>>
>> Regards,
>> Purusothaman R
>>
>>
>> -----Original Message-----
>> From: MyungJoo Ham [mailto:myungjoo.ham@xxxxxxxxx]
>> Sent: Tuesday, May 03, 2011 8:29 AM
>> To: Purusothaman Ramajothi (WT01 - Manufacturing & Hi Tech)
>> Cc: linux-pm@xxxxxxxxxxxxxxxxxxxxxxxxxx
>> Subject: Re: Need Help to debug resume hangs at schedule() function
>>
>> On Mon, May 2, 2011 at 10:52 PM, <purusothaman.ramajothi@xxxxxxxxx> wrote:
>>> Hi,
>>>
>>> I am using the 2.6.27.18 kernel. My machine is powerpc
>>> architecture. I executed the deep sleep operation by executing the following
>>> command:
>>>
>>> "echo mem > /sys/power/state"
>>>
>>> The suspend operation was successful. All the devices and user level threads
>>> are suspended.
>>>
>>> But sometimes during resume my machine hangs. This does not happen
>>> regularly, but it occurs after 60 or 70 deep sleep operations. So for
>>> debugging this I have added checkpoints in the flow of deep sleep. When this
>>> issue was reproduced, the kernel hung around the "thaw_tasks" or "schedule"
>>> function of "thaw_processes". These functions are defined in
>>> "kernel/power/process.c".
>>>
>>
>> Have you enabled [Kernel hacking] ==> [Detect Hung Tasks] and waited
>> more than 120 seconds after the hang?
>>
>>>
>>> Note: The final printk message "Restarting tasks.....done" was not displayed
>>> when this issue occurred. But all the devices and drivers resumed back.
>>>
>>> I was not able to debug this as the reproducibility rate was too low. Kindly
>>> help me in solving the issue.
>>
>> Around 1/100 is not that low for a suspend/resume issue. Don't get too
>> frustrated. You will probably encounter much rarer (let's say 1/10000)
>> incidents if you are doing the job with newly produced chips and
>> boards. :D
>>
>> If you have a H/W tracer similar to T32, try it. It helps a lot.
>>
>>>
>>> Regards,
>>>
>>> Purusothaman R
>>>
>>> Please do not print this email unless it is absolutely necessary.
>>>
>>> The information contained in this electronic message and any attachments to
>>> this message are intended for the exclusive use of the addressee(s) and may
>>> contain proprietary, confidential or privileged information. If you are not
>>> the intended recipient, you should not disseminate, distribute or copy this
>>> e-mail. Please notify the sender immediately and destroy all copies of this
>>> message and any attachments.
>>>
>>> WARNING: Computer viruses can be transmitted via email. The recipient should
>>> check this email and any attachments for the presence of viruses. The
>>> company accepts no liability for any damage caused by any virus transmitted
>>> by this email.
>>>
>>> www.wipro.com
>>>
>>> _______________________________________________
>>> linux-pm mailing list
>>> linux-pm@xxxxxxxxxxxxxxxxxxxxxxxxxx
>>> https://lists.linux-foundation.org/mailman/listinfo/linux-pm
>>>
>>
>> --
>> MyungJoo Ham, Ph.D.
>> Mobile Software Platform Lab,
>> Digital Media and Communications (DMC) Business
>> Samsung Electronics
>> cell: 82-10-6714-2858
>>
>

--
MyungJoo Ham, Ph.D.
Mobile Software Platform Lab,
Digital Media and Communications (DMC) Business
Samsung Electronics
cell: 82-10-6714-2858