Re: A query regarding sysrq-trigger

On Fri, Jan 15, 2010 at 6:02 PM, Nagaprabhanjan Bellari
<nagp.knb@xxxxxxxxx> wrote:
> Hi,
>
> We had a problem where we were trying to debug why events/0 was taking 98%
> of CPU time. I found that writing a ‘t’ to /proc/sysrq-trigger will dump the
> stack traces of all processes. Unfortunately, events/0 was shown as running
> and no stack trace was dumped for it:
>
> ==
> ksoftirqd/0   S 00000000     0     2      1             3       (L-TLB)
> Call trace:
>  5fe8bf50 [4000556c] __switch_to+0x60/0x9c
>  5fe8bf60 [4026a37c] schedule+0x314/0x75c
>  5fe8bfa0 [40026e3c] ksoftirqd+0xb0/0xb4
>  5fe8bfc0 [4003847c] kthread+0xec/0x128
>  5fe8bff0 [40005370] kernel_thread+0x44/0x60
> Next sp 0!
>
> events/0      R running     0     3      1             4     2 (L-TLB)  <===== no stack trace for this. :-(
>
> khelper       S 00000000     0     4      1             5     3 (L-TLB)
> Call trace:
>  5ff47ef0 [4000556c] __switch_to+0x60/0x9c
>  5ff47f00 [4026a37c] schedule+0x314/0x75c
>  5ff47f40 [400333e4] worker_thread+0x214/0x218
>  5ff47fc0 [4003847c] kthread+0xec/0x128
>  5ff47ff0 [40005370] kernel_thread+0x44/0x60
> Next sp 0!
> ==
>
>
> Can one of you tell me how to get the stack trace of a running process? Or
> any other ideas/suggestions to see what events/0 is up to?

Firstly, as I understand it, events/0 is being overwhelmed by too many
work items (queued from some driver you wrote, perhaps by a buggy IRQ
handler that keeps scheduling work on a workqueue?). I can't comment
much further without more information.

Anyhow, to find out what is eating how much CPU time (also known as
profiling), you can either
a. Use OProfile
OR
b. Add some simple code to your kernel's main irq handler (in do_irq) to
record the last PC seen, i.e. the program counter that was interrupted,
either to a proc file (create a proc entry for this) or with printk (but
turn off console output using echo 0 > /proc/sys/kernel/printk, and
gather the samples from the kernel log in /proc/kmsg).
   Once you have collected the PC (program counter) samples, look them up
in your System.map to find which functions they fall in. Most likely the
bulk of them will point to the erring code.
I have done this many times to profile a system (it is quicker than
setting up OProfile) and found it quite effective.
   Incidentally, I use FIQ (instead of IRQ, on ARM processors of course)
for better PC samples, as FIQs have higher priority.
  If you are using any other processor, the same approach with plain
IRQs should still work.
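
For the System.map lookup step, here is a minimal sketch in Python of how
the collected samples could be turned into a per-function hit count. It
is only an illustration, not the exact script I use. The "pcsample:" log
prefix and the function names load_system_map, resolve and histogram are
assumptions made up for this example; adjust the pattern to whatever
your sampling code actually prints.

```python
import bisect
import re
from collections import Counter

# Hypothetical log format: the sampling hook is assumed to print lines
# such as "pcsample: 4026a37c". Change this to match your own printk.
SAMPLE_RE = re.compile(r"pcsample:\s*([0-9a-fA-F]+)")


def load_system_map(lines):
    """Parse System.map lines ("address type name"), keeping text symbols.

    Returns two parallel lists (addresses, names) sorted by address.
    """
    symbols = []
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip anything that is not "address type name"
        addr, sym_type, name = parts
        if sym_type in ("T", "t"):  # code (text) symbols only
            symbols.append((int(addr, 16), name))
    symbols.sort()
    return [a for a, _ in symbols], [n for _, n in symbols]


def resolve(addrs, names, pc):
    """Return (name, offset) for the last symbol starting at or below pc.

    Returns None if pc lies below the first known symbol.
    """
    i = bisect.bisect_right(addrs, pc) - 1
    if i < 0:
        return None
    return names[i], pc - addrs[i]


def histogram(addrs, names, log_lines):
    """Count how many PC samples in the log fall into each function."""
    counts = Counter()
    for line in log_lines:
        match = SAMPLE_RE.search(line)
        if match is None:
            continue  # not a sample line
        hit = resolve(addrs, names, int(match.group(1), 16))
        counts[hit[0] if hit else "<unknown>"] += 1
    return counts
```

To use it, pass the lines of your System.map to load_system_map(), pass
the saved kernel log lines to histogram(), and print
counts.most_common(10). Note that a sample above the last text symbol is
attributed to that last symbol, so treat a surprising top entry with some
suspicion.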
Feel free to ask if you need more info.

good luck
-syed
