Re: interrupt/tasklet issue in custom driver on recent kernels

Peter Teoh <htmldeveloper@xxxxxxxxx> · Thu, 10 Sep 2009 14:35:07 +0800

How about this:

Use "top" to confirm which process is causing the 100% CPU
utilization. Then, as root, run "echo t > /proc/sysrq-trigger" to dump
a stack trace of every task, kernel threads included. The output will
probably overflow your dmesg buffer, so look in /var/log/messages for
the complete trace. If the problem is scheduling related, *I think* the
trace should look something like this (an example taken from the web,
so this is only my guess :-)):

[ 2368.256791] Call Trace:
[ 2368.256796] [<ffffffff805e0632>] ? schedule+0x3e/0x6a9
[ 2368.256801] [<ffffffff80230be3>] ? pick_next_task_fair+0xa0/0xc2
[ 2368.256806] [<ffffffff8028d690>] ? trace_preempt_on+0x1c/0x32
[ 2368.256811] [<ffffffff805e6469>] ? sub_preempt_count+0x49/0x73
[ 2368.256817] [<ffffffff805e35ec>] ? _spin_lock_irqsave+0x35/0x69
[ 2368.256826] [<ffffffff8024afed>] ? __mod_timer+0xce/0xf4
[ 2368.256830] [<ffffffff8024ad0a>] ? del_timer_sync+0x28/0x4d
[ 2368.256836] [<ffffffff805e1152>] ? schedule_timeout+0xac/0xdc
[ 2368.256840] [<ffffffff8024a74b>] ? process_timeout+0x0/0x37
[ 2368.256844] [<ffffffff8024af4f>] ? __mod_timer+0x30/0xf4
[ 2368.256851] [<ffffffff80286ef0>] ? ftraced+0x52/0x1f0
[ 2368.256855] [<ffffffff80257bbf>] ? kthread+0x0/0xa4
[ 2368.256859] [<ffffffff80286e9e>] ? ftraced+0x0/0x1f0
[ 2368.256863] [<ffffffff80257c20>] ? kthread+0x61/0xa4
[ 2368.256868] [<ffffffff8020d7f8>] ? child_rip+0xa/0x12
[ 2368.256873] [<ffffffff8020cee0>] ? restore_args+0x0/0x30
[ 2368.256878] [<ffffffff80257bbf>] ? kthread+0x0/0xa4
[ 2368.256882] [<ffffffff8020d7ee>] ? child_rip+0x0/0x12
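
If the dump is too long to read by eye, a short script can tally which
functions appear most often in it. This is only a rough sketch: it
assumes every frame in the log has the same
"[<address>] ? symbol+0xoffset/0xsize" layout as the example above, and
the names parse_frames, count_symbols and summarize_log are made up for
illustration.

```python
import re
from collections import Counter

# One frame of a kernel call trace, for example:
#   [ 2368.256796] [<ffffffff805e0632>] ? schedule+0x3e/0x6a9
# The "? " marker is optional; it flags frames the unwinder is unsure of.
FRAME_RE = re.compile(
    r"\[<[0-9a-fA-F]+>\]\s+(?:\?\s+)?"
    r"(?P<symbol>[A-Za-z_][\w.]*)\+0x[0-9a-fA-F]+/0x[0-9a-fA-F]+"
)


def parse_frames(lines):
    """Return the symbol name of every stack frame found in lines, in order."""
    symbols = []
    for line in lines:
        match = FRAME_RE.search(line)
        if match:
            symbols.append(match.group("symbol"))
    return symbols


def count_symbols(lines):
    """Count how many times each symbol appears across all frames."""
    return Counter(parse_frames(lines))


def summarize_log(path, top=15):
    """Print the most frequent symbols in a log file (path is an assumption)."""
    with open(path, errors="replace") as log:
        for symbol, hits in count_symbols(log).most_common(top):
            print(f"{hits:6d}  {symbol}")
```

Calling summarize_log("/var/log/messages") after the sysrq dump prints
the most common symbols. A function that dominates the tally is only a
candidate for a closer look, not proof of where the time goes.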

On Wed, Sep 9, 2009 at 10:04 PM, Jason Nymble<jason.nymble@xxxxxxxxx> wrote:
>
> On 09 Sep 2009, at 3:47 PM, Mulyadi Santosa wrote:
>
>> Hi Jason...
>>
>> On Wed, Sep 9, 2009 at 3:39 PM, Jason Nymble <jason.nymble@xxxxxxxxx> wrote:
>>>
>>> Hi,
>>>
>>> Background: We use a custom kernel driver module for our PCIe device which
>>> processes bulk data between the host and the card. The card issues MSI
>>> interrupts at up to 20kHz to the host, and the driver interrupt routine
>>> essentially just calls tasklet_schedule() and returns IRQ_HANDLED, and the
>>> work is performed inside the tasklet routine. This has worked very well for
>>> us for the past several years, with acceptably low overhead on the processor
>>> servicing the interrupts and running the tasklet, using Linux kernel
>>> versions from about 2.6.13 to 2.6.24.
>>>
>>> Recent tests on kernels from 2.6.25 to 2.6.30 indicate some serious
>>> regression however. The CPU core servicing the interrupts/tasklets shows
>>> 100% si usage in top for ksoftirqd, and the driver can consequently only
>>> handle a very small fraction of what it was able to handle using kernel
>>> <=2.6.24 (slowdown of around 50-100x)... Even when we scale back our
>>> interrupt rate to 1kHz, we still see this poor behavior, and from what we
>>> can tell the time isn't actually spent in our tasklet code itself (not 100%
>>> sure of this).
>>
>> I guess it has something to do with CFS (the Completely Fair Scheduler).
>> Your tasklet's time slice is "punished" so that it cannot dominate the
>> entire CPU time by itself, hence the name "fair scheduler".
>>
>> I agree that profiling with OProfile might reveal the percentage of time
>> spent in the various kernel functions (including your driver's code).
>>
>> As a workaround, I guess you could create as many workqueue threads as
>> you have CPU cores, so that the time slicing is split among these
>> threads.
>>
>>
>
>
> Interesting, I also considered at one point whether it has something to do
> with CFS, and even contemplated using the BFS patches. What bothers me,
> though, is why kernels <= 2.6.24 used nowhere near 100% CPU, while kernels
> >= 2.6.25 use 100% CPU and get MUCH less work done (50-100x less)... That
> doesn't sound to me like a symptom of just being scheduled more fairly,
> unless there is a bug. (Nothing else is using the CPU in this particular
> test scenario.)
>
> --
> To unsubscribe from this list: send an email with
> "unsubscribe kernelnewbies" to ecartis@xxxxxxxxxxxx
> Please read the FAQ at http://kernelnewbies.org/FAQ
>
>

-- 
Regards,
Peter Teoh
