Hi, I'm testing a module that implement a new scheduler. If I abilitate the high risolution timer the kernel deadlock with this output: schedtest@dhcp046:~/src/dynrsv/examples$ ./schedtest -q 1000 -t 10000 Setting schedulestopped custom tracer. ======================================================= [ INFO: possible circular locking dependency detected ] [ 2.6.21.4-rt12-cfs-v17-whrt #11 ------------------------------------------------------- schedtest/2815 is trying to acquire lock: (&cpu_base->lock_key){+...}, at: [<c013a4f3>] lock_hrtimer_base +0x18/0x33 but task is already holding lock: (&rq->rq_lock_key){+...}, at: [<c011f965>] generic_sched_setscheduler +0x76/0x13 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&rq->rq_lock_key){+...}: [<c01414a0>] __lock_acquire+0xa17/0xba8 [<c0141699>] lock_acquire+0x68/0x82 [<c02bd844>] __spin_lock+0x35/0x42 [<c011e9ae>] task_rq_lock+0x36/0x5d [<c01203a7>] try_to_wake_up+0x27/0x368 [<c012078b>] wake_up_process+0x19/0x1b [<c013a146>] hrtimer_wakeup+0x18/0x1c [<c013ab16>] hrtimer_interrupt+0x110/0x19a [<c0114afc>] smp_apic_timer_interrupt+0x6c/0x7e [<c0104ac3>] apic_timer_interrupt+0x33/0x38 [<c02bc8c1>] rt_spin_lock_slowlock+0x58/0x17a [<c02bd076>] rt_spin_lock+0x2e/0x56 [<c01646e8>] unmap_vmas+0x466/0x4b4 [<c0167144>] exit_mmap+0x8d/0x112 [<c01238e9>] mmput+0x35/0x95 [<c012757e>] exit_mm+0xef/0xf4 [<c012891d>] do_exit+0x1e5/0x76c [<c0128f23>] sys_exit_group+0x0/0x11 [<c0128f32>] sys_exit_group+0xf/0x11 [<c0103ffc>] syscall_call+0x7/0xb [<ffffffff>] 0xffffffff -> #0 (&cpu_base->lock_key){+...}: [<c0141384>] __lock_acquire+0x8fb/0xba8 [<c0141699>] lock_acquire+0x68/0x82 [<c02bd5c5>] __spin_lock_irqsave+0x3e/0x4e [<c013a4f3>] lock_hrtimer_base+0x18/0x33 [<c013a5e5>] hrtimer_start+0x19/0xfb [<d097d916>] sched_set_timer+0x2d/0x32 [cbs_sched] [<d097d5d5>] cbs_start_accounting+0x57/0x5e [cbs_sched] [<d097d7ef>] generic_request+0x40/0x49 [cbs_sched] [<c011d9d3>] enqueue_task+0xb3/0xc2 [<c011d9f0>] activate_task+0xe/0x1b [<c011f9de>] generic_sched_setscheduler+0xef/0x153 [<c011fcfc>] do_sched_setscheduler+0x8c/0xc0 [<c011fd5d>] sys_sched_setscheduler+0x1a/0x1c [<c0103f70>] sysenter_past_esp+0x5d/0x99 [<ffffffff>] 0xffffffff other info that might help us debug this: 2 locks held by schedtest/2815: #0: ((raw_spinlock_t *)(&p->pi_lock)){....}, at: [<c011f947>] generic_sched_s3 #1: (&rq->rq_lock_key){+...}, at: [<c011f965>] generic_sched_setscheduler+0x73 stack backtrace: [<c0104fe2>] show_trace_log_lvl+0x1a/0x2f [<c01056c8>] show_trace+0x12/0x14 [<c010575b>] dump_stack+0x16/0x18 [<c013fb80>] print_circular_bug_tail+0x5f/0x68 [<c0141384>] __lock_acquire+0x8fb/0xba8 [<c0141699>] lock_acquire+0x68/0x82 [<c02bd5c5>] __spin_lock_irqsave+0x3e/0x4e [<c013a4f3>] lock_hrtimer_base+0x18/0x33 [<c013a5e5>] hrtimer_start+0x19/0xfb [<d097d916>] sched_set_timer+0x2d/0x32 [cbs_sched] [<d097d5d5>] cbs_start_accounting+0x57/0x5e [cbs_sched] [<d097d7ef>] generic_request+0x40/0x49 [cbs_sched] [<c011d9d3>] enqueue_task+0xb3/0xc2 [<c011d9f0>] activate_task+0xe/0x1b [<c011f9de>] generic_sched_setscheduler+0xef/0x153 [<c011fcfc>] do_sched_setscheduler+0x8c/0xc0 [<c011fd5d>] sys_sched_setscheduler+0x1a/0x1c [<c0103f70>] sysenter_past_esp+0x5d/0x99 ======================= BUG: scheduling while atomic: schedtest/0x00000000/2815, CPU#0 INFO: lockdep is turned off. irq event stamp: 6602 hardirqs last enabled at (6601): [<c0140688>] trace_hardirqs_on+0xb/0xd hardirqs last disabled at (6602): [<c013f06c>] trace_hardirqs_off +0xb/0xd softirqs last enabled at (0): [<c0123cc7>] copy_process+0x35a/0x127f softirqs last disabled at (0): [<00000000>] 0x0 [<c0104fe2>] show_trace_log_lvl+0x1a/0x2f [<c01056c8>] show_trace+0x12/0x14 [<c010575b>] dump_stack+0x16/0x18 [<c02baf48>] __sched_text_start+0xe0/0xc6e [<c0104096>] work_resched+0x6/0x1d ======================= r (1000 10000) BUG: sleeping function called from invalid context schedtest(2815) at kernel/rt3 in_atomic():1 [ffffffff], irqs_disabled():0 INFO: lockdep is turned off. [<c0104fe2>] show_trace_log_lvl+0x1a/0x2f [<c01056c8>] show_trace+0x12/0x14 [<c010575b>] dump_stack+0x16/0x18 [<c011e894>] __might_sleep+0x107/0x10d [<c02bd06f>] rt_spin_lock+0x27/0x56 [<c0130918>] force_sig_info+0x20/0x81 [<c011aa27>] force_sig_info_fault+0x21/0x23 [<c011af8e>] do_page_fault+0x53a/0x544 [<c02be394>] error_code+0x7c/0x84 ======================= BUG: sleeping function called from invalid context schedtest(2815) at kernel/rt3 in_atomic():1 [ffffffff], irqs_disabled():0 INFO: lockdep is turned off. [<c0104fe2>] show_trace_log_lvl+0x1a/0x2f [<c01056c8>] show_trace+0x12/0x14 [<c010575b>] dump_stack+0x16/0x18 [<c011e894>] __might_sleep+0x107/0x10d [<c02bd06f>] rt_spin_lock+0x27/0x56 [<c017251f>] kmem_cache_alloc+0x4b/0xb5 [<c012f92c>] __sigqueue_alloc+0x36/0x5d [<c012f988>] send_signal+0x35/0xe9 [<c0130734>] specific_send_sig_info+0x58/0x89 [<c013095f>] force_sig_info+0x67/0x81 [<c011aa27>] force_sig_info_fault+0x21/0x23 [<c011af8e>] do_page_fault+0x53a/0x544 [<c02be394>] error_code+0x7c/0x84 ======================= BUG: sleeping function called from invalid context schedtest(2815) at kernel/rt3 in_atomic():1 [ffffffff], irqs_disabled():0 INFO: lockdep is turned off. [<c0104fe2>] show_trace_log_lvl+0x1a/0x2f [<c01056c8>] show_trace+0x12/0x14 [<c010575b>] dump_stack+0x16/0x18 [<c011e894>] __might_sleep+0x107/0x10d [<c02bd06f>] rt_spin_lock+0x27/0x56 [<c0171cb6>] cache_alloc_refill+0x58/0x5c8 [<c0172556>] kmem_cache_alloc+0x82/0xb5 [<c012f92c>] __sigqueue_alloc+0x36/0x5d [<c012f988>] send_signal+0x35/0xe9 [<c0130734>] specific_send_sig_info+0x58/0x89 [<c013095f>] force_sig_info+0x67/0x81 [<c011aa27>] force_sig_info_fault+0x21/0x23 [<c011af8e>] do_page_fault+0x53a/0x544 [<c02be394>] error_code+0x7c/0x84 ======================= BUG: sleeping function called from invalid context schedtest(2815) at kernel/rt3 in_atomic():1 [ffffffff], irqs_disabled():0 INFO: lockdep is turned off. [<c0104fe2>] show_trace_log_lvl+0x1a/0x2f [<c01056c8>] show_trace+0x12/0x14 [<c010575b>] dump_stack+0x16/0x18 [<c011e894>] __might_sleep+0x107/0x10d [<c02bd06f>] rt_spin_lock+0x27/0x56 [<c0171ed2>] cache_alloc_refill+0x274/0x5c8 [<c0172556>] kmem_cache_alloc+0x82/0xb5 [<c012f92c>] __sigqueue_alloc+0x36/0x5d [<c012f988>] send_signal+0x35/0xe9 [<c0130734>] specific_send_sig_info+0x58/0x89 [<c013095f>] force_sig_info+0x67/0x81 [<c011aa27>] force_sig_info_fault+0x21/0x23 [<c011af8e>] do_page_fault+0x53a/0x544 [<c02be394>] error_code+0x7c/0x84 ======================= BUG: scheduling while atomic: schedtest/0x00000000/2815, CPU#0 INFO: lockdep is turned off. irq event stamp: 6602 hardirqs last enabled at (6601): [<c0140688>] trace_hardirqs_on+0xb/0xd hardirqs last disabled at (6602): [<c013f06c>] trace_hardirqs_off +0xb/0xd softirqs last enabled at (0): [<c0123cc7>] copy_process+0x35a/0x127f softirqs last disabled at (0): [<00000000>] 0x0 [<c0104fe2>] show_trace_log_lvl+0x1a/0x2f [<c01056c8>] show_trace+0x12/0x14 [<c010575b>] dump_stack+0x16/0x18 [<c02baf48>] __sched_text_start+0xe0/0xc6e [<c02bbcbc>] schedule+0xe5/0x104 [<c02bc95b>] rt_spin_lock_slowlock+0xf2/0x17a [<c02bd076>] rt_spin_lock+0x2e/0x56 [<c0172179>] cache_alloc_refill+0x51b/0x5c8 [<c0172556>] kmem_cache_alloc+0x82/0xb5 [<c012f92c>] __sigqueue_alloc+0x36/0x5d [<c012f988>] send_signal+0x35/0xe9 [<c0130734>] specific_send_sig_info+0x58/0x89 [<c013095f>] force_sig_info+0x67/0x81 [<c011aa27>] force_sig_info_fault+0x21/0x23 [<c011af8e>] do_page_fault+0x53a/0x544 [<c02be394>] error_code+0x7c/0x84 ======================= ------------[ cut here ]------------ kernel BUG at kernel/exit.c:866! invalid opcode: 0000 [#1] PREEMPT SMP Modules linked in: cbs_sched ipv6 ppdev lp button ac battery dm_snapshot dm_mirn CPU: 0 EIP: 0060:[<c0128794>] Not tainted VLI EFLAGS: 00010206 (2.6.21.4-rt12-cfs-v17-whrt #11) EIP is at do_exit+0x5c/0x76c eax: cc0da000 ebx: ced98610 ecx: cc1091d0 edx: 00000000 esi: 0000000b edi: cc1091d0 ebp: cc0daeb0 esp: cc0dae7c ds: 007b es: 007b fs: 00d8 gs: 0033 ss: 0068 preempt:00000000 Process schedtest (pid: 2815, ti=cc0da000 task=cc1091d0 task.ti=cc0da000) Stack: cc0230a4 cc023090 c02bc7b7 cc023090 0000000b cc0dae98 c0140688 cc0daea8 00000292 cc023090 ced98610 0000000b cc022c58 cc0daec4 c0128f23 0000000b 0000000a cc022c58 cc0daeec c0130f42 cc0dafb8 cc0daf90 cc0daf10 cc1096b4 Call Trace: [<c0104fe2>] show_trace_log_lvl+0x1a/0x2f [<c0105099>] show_stack_log_lvl+0xa2/0xb6 [<c010529e>] show_registers+0x1f1/0x2f6 [<c01054c5>] die+0x122/0x23a [<c010566f>] do_trap+0x92/0xaa [<c0105edb>] do_invalid_op+0x97/0xa1 [<c02be394>] error_code+0x7c/0x84 [<c0128f23>] sys_exit_group+0x0/0x11 [<c0130f42>] get_signal_to_deliver+0x39f/0x3de [<c01035fa>] do_notify_resume+0xad/0x6c0 [<c01040c0>] work_notifysig+0x13/0x1b ======================= INFO: lockdep is turned off. Code: 00 00 c7 44 24 04 1b 8e 33 c0 c7 04 24 88 0a 33 c0 e8 9b de ff ff e8 c1 c EIP: [<c0128794>] do_exit+0x5c/0x76c SS:ESP 0068:cc0dae7c Kernel panic - not syncing: Fatal exception in interrupt [<c0104fe2>] show_trace_log_lvl+0x1a/0x2f [<c01056c8>] show_trace+0x12/0x14 [<c010575b>] dump_stack+0x16/0x18 [<c0125b17>] panic+0x50/0xfb [<c01055a8>] die+0x205/0x23a [<c010566f>] do_trap+0x92/0xaa [<c0105edb>] do_invalid_op+0x97/0xa1 [<c02be394>] error_code+0x7c/0x84 [<c0128f23>] sys_exit_group+0x0/0x11 [<c0130f42>] get_signal_to_deliver+0x39f/0x3de Message from sy [<c01035fa>] slogd@dhcp046 at Thu Feb 5 14:11:14 2009 ... do_notify_resume+0xad/0x6c0 dhcp046 kernel: [<c01040c0>] work_notifysig+0x13/0x1b ======================= ------------[ cut here ]------------ Message from syslogd@dhcp046 at Thu Feb 5 14:11:14 2009 ... dhcp046 kernel: invalid opcode: 0000 [#1] Message from syslogd@dhcp046 at Thu Feb 5 14:11:14 2009 ... dhcp046 kernel: PREEMPT SMP Message from syslogd@dhcp046 at Thu Feb 5 14:11:14 2009 ... dhcp046 kernel: CPU: 0 Message from syslogd@dhcp046 at Thu Feb 5 14:11:14 2009 ... dhcp046 kernel: EIP: 0060:[<c0128794>] Not tainted VLI Message from syslogd@dhcp046 at Thu Feb 5 14:11:14 2009 ... dhcp046 kernel: EFLAGS: 00010206 (2.6.21.4-rt12-cfs-v17-whrt #11) Message from syslogd@dhcp046 at Thu Feb 5 14:11:14 2009 ... dhcp046 kernel: EIP is at do_exit+0x5c/0x76c Message from syslogd@dhcp046 at Thu Feb 5 14:11:14 2009 ... dhcp046 kernel: eax: cc0da000 ebx: ced98610 ecx: cc1091d0 edx: 00000000 Message from syslogd@dhcp046 at Thu Feb 5 14:11:14 2009 ... dhcp046 kernel: esi: 0000000b edi: cc1091d0 ebp: cc0daeb0 esp: cc0dae7c Message from syslogd@dhcp046 at Thu Feb 5 14:11:14 2009 ... dhcp046 kernel: ds: 007b es: 007b fs: 00d8 gs: 0033 ss: 0068 preempt:0000 Message from syslogd@dhcp046 at Thu Feb 5 14:11:14 2009 ... dhcp046 kernel: Call Trace: Message from syslogd@dhcp046 at Thu Feb 5 14:11:14 2009 ... dhcp046 kernel: [<c0104fe2>] show_trace_log_lvl+0x1a/0x2f Message from syslogd@dhcp046 at Thu Feb 5 14:11:14 2009 ... dhcp046 kernel: 0000000a cc022c58 cc0daeec c0130f42 cc0dafb8 cc0daf90 cc Message from syslogd@dhcp046 at Thu Feb 5 14:11:14 2009 ... dhcp046 kernel: [<c0105099>] show_stack_log_lvl+0xa2/0xb6 Message from syslogd@dhcp046 at Thu Feb 5 14:11:14 2009 ... dhcp046 kernel: [<c01054c5>] die+0x122/0x23a Message from syslogd@dhcp046 at Thu Feb 5 14:11:14 2009 ... dhcp046 kernel: [<c010529e>] show_registers+0x1f1/0x2f6 Message from syslogd@dhcp046 at Thu Feb 5 14:11:14 2009 ... dhcp046 kernel: 00000292 cc023090 ced98610 0000000b cc022c58 cc0daec4 c0 Message from syslogd@dhcp046 at Thu Feb 5 14:11:14 2009 ... Message from syslogd@dhcp046 at Thu Feb 5 14:11:14 2009 ... dhcp046 kernel: Stack: cc0230a4 cc023090 c02bc7b7 cc023090 0000000b cc0dae98 c0 dhcp046 kernel: [<c010566f>] do_trap+0x92/0xaa Message from syslogd@dhcp046 at Thu Feb 5 14:11:14 2009 ... dhcp046 kernel: [<c0105edb>] do_invalid_op+0x97/0xa1 Message from syslogd@dhcp046 at Thu Feb 5 14:11:14 2009 ... dhcp046 kernel: [<c02be394>] error_code+0x7c/0x84 Message from syslogd@dhcp046 at Thu Feb 5 14:11:14 2009 ... dhcp046 Message from syslogd@dhcp046 at Thu Feb 5 14:11:14 2009 ... Message from syslogd@dhcp046 at Thu Feb 5 14:11:14 2009 ... kernel: [<c0130f42>] get_signal_to_deliver+0x39f/0x3dedhcp046 kernel: [<c01281 dhcp046 kernel: [<c01035fa>] do_notify_resume+0xad/0x6c0 Message from syslogd@dhcp046 at Thu Feb 5 14:11:14 2009 ... dhcp046 kernel: [<c01040c0>] work_notifysig+0x13/0x1b Message from syslogd@dhcp046 at Thu Feb 5 14:11:14 2009 ... dhcp046 kernel: ======================= Message from syslogd@dhcp046 at Thu Feb 5 14:11:14 2009 ... dhcp046 kernel: Code: 00 00 c7 44 24 04 1b 8e 33 c0 c7 04 24 88 0a 33 c0 e8 9b Message from syslogd@dhcp046 at Thu Feb 5 14:11:15 2009 ... dhcp046 kernel: EIP: [<c0128794>] do_exit+0x5c/0x76c SS:ESP 0068:cc0dae7c Message from syslogd@dhcp046 at Thu Feb 5 14:11:14 2009 ... dhcp046 kernel: Process schedtest (pid: 2815, ti=cc0da000 task=cc1091d0 task.ti) Message from syslogd@dhcp046 at Thu Feb 5 14:11:15 2009 ... dhcp046 kernel: Kernel panic - not syncing: Fatal exception in interrupt Could someone help me reading this log? How can I in which function are the processes when deadlock happens? Thank you for your help Nicola -- To unsubscribe from this list: send an email with "unsubscribe kernelnewbies" to ecartis@xxxxxxxxxxxx Please read the FAQ at http://kernelnewbies.org/FAQ