On Thu, Jan 29, 2009 at 06:00:43PM -0500, Steven Rostedt wrote: [...] > > BUG: sleeping function called from invalid context at kernel/rtmutex.c:683 > > in_atomic(): 1 [00000100], irqs_disabled(): 0, pid: 7, name: sirq-net-rx/0 > > Call Trace: > > [cf84bc20] [c0008be8] show_stack+0x4c/0x16c (unreliable) > > [cf84bc60] [c001c194] __might_sleep+0xd8/0xf8 > > [cf84bc70] [c02b7768] rt_spin_lock+0x30/0x78 > > [cf84bc80] [c00800e0] kmem_cache_alloc+0x50/0x17c > > [cf84bcb0] [c02568a4] ip_append_data+0x974/0x978 > > [cf84bd30] [c027aa0c] icmp_push_reply+0x54/0x128 > > [cf84bd50] [c027b59c] icmp_send+0x284/0x380 > > [cf84be40] [c0277328] __udp4_lib_rcv+0x3d4/0x5a0 > > [cf84bea0] [c0253208] ip_local_deliver_finish+0x74/0x128 [...] > Turn on CONFIG_PREEMPT_TRACE (not TRACER) and it should show the > location that left preemption disabled. Thank you Steven, PREEMPT_TRACE is a great tool indeed (though on PowerPC it doesn't work out of the box, but easily fixable). So, the result: --------------------------- | preempt count: 00000100 ] | 1-level deep critical section nesting: ---------------------------------------- .. [<c002d9fc>] .... local_bh_disable+0x1c/0x34 .....[<c02afef8>] .. ( <= icmp_send+0xac/0x388) icmp_send() calls icmp_xmit_lock() that disables bottom halves, then icmp_send() calls ip_append_data() that tries to allocate things with GFP_ATOMIC, which should be OK... I guess now this isn't true for -rt kernels, correct? A comment in slab.c ("which in turn implies that nobody does allocations from atomic contexts") seem to confirm this. (A bit unrelated question: If that's how things work now (i.e. GFP_ATOMIC is equal to GFP_KERNEL or vice-versa), how should we allocate things in IRQF_NODELAY/TIMER interrupts?) Anyway, this snippet fixes the issue: diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c index 6bccfbe..4a4862b 100644 --- a/net/ipv4/icmp.c +++ b/net/ipv4/icmp.c @@ -222,6 +222,9 @@ static inline struct sock *icmp_xmit_lock(struct net *net) local_bh_enable(); return NULL; } +#ifdef CONFIG_PREEMPT_RT + local_bh_enable(); +#endif return sk; } -- Now the kernel is able to boot up to the login prompt, cool! But after a while this pops up: INFO: task sirq-high/0:4 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. sirq-high/0 D 00000000 0 4 2 Call Trace: [cf839eb0] [60320800] 0x60320800 (unreliable) [cf839f70] [c0009b34] __switch_to+0x50/0x74 [cf839f90] [c02ee48c] schedule+0x19c/0x380 [cf839fd0] [c00427ac] kthread+0x34/0x8c [cf839ff0] [c001354c] kernel_thread+0x4c/0x68 --------------------------- | preempt count: 00000002 ] | 2-level deep critical section nesting: ---------------------------------------- .. [<c02ee340>] .... schedule+0x50/0x380 .....[<c00427ac>] .. ( <= kthread+0x34/0x8c) .. [<c02f0700>] .... _spin_lock_irq+0x2c/0x4c .....[<c02ee388>] .. ( <= schedule+0x98/0x380) And keeps popping up every 120 seconds, though both kernel and userspace stay alive. Thanks, -- Anton Vorontsov email: cbouatmailru@xxxxxxxxx irc://irc.freenode.net/bd2 -- To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html