I think it would be nice if load_module in ftrace mode will report that the driver should be recompiled. same as where it screams about version or license. So, i managed to use ftrace, problem is that i cannot get the trace out of the machine, neither by scp nor by sz. i cannot cp it since it will flush to drive. ( even though i set flush to RT 99). Here is part of the trace file. tEthernInp-1173 [000] 133.002614: sub_preempt_count <-unmask_8259A_irq tEthernInp-1173 [000] 133.002614: sub_preempt_count <-handle_level_irq tEthernInp-1173 [000] 133.002614: irq_exit <-do_IRQ tEthernInp-1173 [000] 133.002615: sub_preempt_count <-irq_exit tEthernInp-1173 [000] 133.002615: wakeup_softirqd <-irq_exit tEthernInp-1173 [000] 133.002615: wake_up_process <-wakeup_softirqd tEthernInp-1173 [000] 133.002616: try_to_wake_up <-wake_up_process tEthernInp-1173 [000] 133.002616: add_preempt_count <-try_to_wake_up tEthernInp-1173 [000] 133.002616: add_preempt_count <-try_to_wake_up tEthernInp-1173 [000] 133.002617: activate_task <-try_to_wake_up tEthernInp-1173 [000] 133.002617: enqueue_task <-activate_task tEthernInp-1173 [000] 133.002617: update_rq_clock <-enqueue_task tEthernInp-1173 [000] 133.002618: enqueue_task_rt <-enqueue_task tEthernInp-1173 [000] 133.002618: dequeue_rt_stack <-enqueue_task_rt tEthernInp-1173 [000] 133.002618: ttwu_do_wakeup.clone.139 <-try_to_wa On Sun, Feb 5, 2012 at 12:02 PM, Raz <raziebe@xxxxxxxxx> wrote: > also, this is x86-32 machine with a single Celeron processor 1.5Ghz. > > On Sun, Feb 5, 2012 at 11:57 AM, Raz <raziebe@xxxxxxxxx> wrote: >> Hey Steven >> >> This is the first time I try frace on this kernel. it turned out that >> when enabling >> ftrace kernel oops whenever one loads a module - see bellow. >> I debugged it a little and the null pointer is in trace_module_add_events >> in the loop , the call pointer is NULL. >> >> for_each_event(call, start, end) { >> call is null. >> } >> >> I will continue debugging it a bit more. >> >> >> 4.137324] e100 0000:02:03.0: eth0: NIC Link is Up 100 Mbps Full Duplex >> [ 5.180354] Disabling lock debugging due to kernel taint >> [ 5.180792] BUG: unable to handle kernel NULL pointer dereference at (null) >> [ 5.180799] IP: [<c1084f7c>] trace_module_notify+0x9c/0x160 >> [ 5.180810] *pde = 00000000 >> [ 5.180813] Oops: 0000 [#1] PREEMPT >> [ 5.180818] Modules linked in: sercos_driver(O+) >> [ 5.180822] >> [ 5.180826] Pid: 1079, comm: modprobe Tainted: G O >> 3.2.0-rc5-rt8 #8 N/A N/A /MODB >> [ 5.180832] EIP: 0060:[<c1084f7c>] EFLAGS: 00010286 CPU: 0 >> [ 5.180836] EIP is at trace_module_notify+0x9c/0x160 >> [ 5.180839] EAX: de513f44 EBX: dfeecc60 ECX: de513e0c EDX: de513e74 >> [ 5.180842] ESI: 00000000 EDI: de513edc EBP: de60df3c ESP: de60df14 >> [ 5.180846] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068 >> [ 5.180849] Process modprobe (pid: 1079, ti=de60c000 task=de519030 >> task.ti=de60c000) >> [ 5.180852] Stack: >> [ 5.180854] de513e74 de513f44 de513edc de513f44 de513e0c de513e74 >> 7fbb1b30 c14578b0 >> [ 5.180859] 00000000 c14563a0 de60df58 c1055ba5 dfeecc60 00000001 >> c1455d50 00000001 >> [ 5.180865] dfeecc60 de60df78 c1055ee3 fffffffd 00000000 00000000 >> dfeecc60 0000223f >> [ 5.180870] Call Trace: >> [ 5.180878] [<c1055ba5>] notifier_call_chain+0x45/0x60 >> [ 5.180883] [<c1055ee3>] __blocking_notifier_call_chain+0x43/0x70 >> [ 5.180888] [<c1055f2f>] blocking_notifier_call_chain+0x1f/0x30 >> [ 5.180893] [<c1069682>] sys_init_module+0x62/0x1c0 >> [ 5.180900] [<c10c32c1>] ? sys_close+0x71/0xc0 >> [ 5.180906] [<c13441e1>] syscall_call+0x7/0xb >> [ 5.180909] Code: 00 8d 90 44 01 00 00 83 c0 74 89 55 e4 89 45 ec >> 8d b4 26 00 00 00 00 8b 45 e4 8b 55 ec 89 7c 24 08 8b 4d e8 89 44 24 >> 04 89 14 24 >> >> >> On Thu, Feb 2, 2012 at 4:12 PM, Steven Rostedt <rostedt@xxxxxxxxxxx> wrote: >>> On Thu, 2012-02-02 at 12:52 +0200, Raz wrote: >>>> hey >>>> >>>> I am trying to understand why a user space MT process behaves in an >>>> unexpected manner. >>>> I have real time process, which executes most of its threads in RT priority, >>>> and from time to time a task ( medium priority ) is executing without >>>> trying to stop. >>>> >>>> There are times, that this task is hogging the entire process and no >>>> other **higher*** rt priority >>>> task is getting any cpu time. >>>> >>>> Linux is **not** getting hogged. when setting the serial console ( >>>> irq/serial ) and its shell ( /bin/sh ) >>>> to a higher priority linux is responsive. >>>> >>>> Observing the task list through ps command i can see that this task's >>>> threads when >>>> to un-interruptible sleep. >>>> >>>> Any idea why is it happening ? >>> >>> Some ideas, but I need a lot more information. >>> >>>> kernel is 3.2.0-rc5-rt8. >>> >>> Is this a x86 box? >>> >>> Can you run a trace on the task that is going berserk? >>> >>> Enable CONFIG_FUNCTION_TRACER in your kernel and do the following: >>> >>> # mount -t debugfs nodev /sys/kernel/debug >>> # cd /sys/kernel/debug/tracing >>> # echo <pid-of-berserk-task> > set_ftrace_pid >>> # echo function > current_tracer >>> # cat trace >>> >>> Thanks! >>> >>> -- Steve >>> >>> -- To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html