Re: jprobes for 'enqueue_entity' and 'dequeue_entity' causes kernel hang on spinlock

Mulyadi Santosa <mulyadi.santosa@xxxxxxxxx> · Fri, 6 Mar 2009 21:21:33 +0700

Hi...

On Fri, Mar 6, 2009 at 4:57 PM, Sukanto Ghosh
<sukanto.cse.iitb@xxxxxxxxx> wrote:
> Hi,
>
> I have to log when a process gets in/out of the runqueue. For the same
> I took the jprobes example code from Documentation/kprobes.txt and
> modified it to probe enqueue_entity() and dequeue_entity() . When I
> insmod my module the kernel hangs. On running with a remote gdb I
> could figure out that it is spinning on a spinlock.
>
> I have used printks in my module code.  It is interesting to note that
> wake_up_klogd() is being called twice in this chain and I think it is
> causing the trouble, as it is spinning on a lock which it has already
> locked earlier.  Is my analysis right ?  Am I doing anything wrong
> here ?
>
>
> #0  0xc04180fa in __ticket_spin_lock (lock=0xc0740ea4) at
> include/asm/spinlock.h:75
> #1  0xc0639667 in _spin_lock_irqsave (lock=0x1e1d) at
> include/asm/paravirt.h:1401
> #2  0xc041dfc3 in __wake_up (q=0xc0740ea4, mode=7709, nr_exclusive=1,
> key=0x0) at kernel/sched.c:4585
> #3  0xc0429eb5 in wake_up_klogd () at kernel/printk.c:988
> #4  0xc042a03d in release_console_sem () at kernel/printk.c:1036
> #5  0xc042a49e in vprintk (fmt=0xe0a370d0 "<6>probe[enqueue_entity]:
> [pid:%u]\n", args=0xde98edd4 "\215\006") at kernel/printk.c:771
> #6  0xc063762d in printk (fmt=0xe0a370d0 "<6>probe[enqueue_entity]:
> [pid:%u]\n") at kernel/printk.c:604
> #7  0xe0a37041 in ?? ()
> #8  0xc042041a in enqueue_task_fair (rq=0xc140a580, p=<value optimized
> out>, wakeup=1) at kernel/sched_fair.c:928
> #9  0xc041c540 in enqueue_task (rq=0xc140a580, p=0xdecb8000, wakeup=1)
> at kernel/sched.c:1644
> #10 0xc041c67d in activate_task (rq=0xc140a580, p=0x1e1d, wakeup=1) at
> kernel/sched.c:1715
> #11 0xc0424133 in try_to_wake_up (p=0xdecb8000, state=1, sync=0) at
> kernel/sched.c:2281
> #12 0xc04241bc in default_wake_function (curr=<value optimized out>,
> mode=7709, sync=1, key=0x0) at kernel/sched.c:4546
> #13 0xc043a1ea in autoremove_wake_function (wait=0xc0740ea4,
> mode=7709, sync=1, key=0x0) at kernel/wait.c:132
> #14 0xc041cb30 in __wake_up_common (q=<value optimized out>, mode=1,
> nr_exclusive=1, sync=0, key=0x0) at kernel/sched.c:4567
> #15 0xc041dfd6 in __wake_up (q=0xc0740ea4, mode=1, nr_exclusive=1,
> key=0x0) at kernel/sched.c:4586
> #16 0xc0429eb5 in wake_up_klogd () at kernel/printk.c:988
> #17 0xc042a03d in release_console_sem () at kernel/printk.c:1036
> #18 0xc042a49e in vprintk (fmt=0xe0a3712d "<6>Planted jprobe at %p,
> handler addr %p\n", args=0xde98ef3c "�\001B�%p����\230�%\021@�")
>    at kernel/printk.c:771
> #19 0xc063762d in printk (fmt=0xe0a3712d "<6>Planted jprobe at %p,
> handler addr %p\n") at kernel/printk.c:604
> #20 0xe089802f in ?? ()
> #21 0xc0401125 in do_one_initcall (fn=0xe0898000) at init/main.c:715
> #22 0xc0449167 in sys_init_module (umod=0x9e2c018, len=94352,
> uargs=0x9e2c008 "") at kernel/module.c:2291

Looking at this stack frame, I think here's what happen:
your module is inserted, then it registers jprobe. It makes every
entity queueing fires up your probe, which is printk-ing the name of
the task (CMIIW here).

the problem is, module insertion itself is doing printk(), which is
eventually kicking off klogd to wake up, that also fires your probe.
So, in one chain, two printk() try to acquire console spinlocks.

If you agree with me, try to eliminate all the printk()s in module
initialization function and see if it helps.

regards,

Mulyadi.

--
To unsubscribe from this list: send an email with
"unsubscribe kernelnewbies" to ecartis@xxxxxxxxxxxx
Please read the FAQ at http://kernelnewbies.org/FAQ