Thomas,
Please find below the results of the experiment with CONFIG_NOHZ disabled.
Iratxo Pichel Ortiz wrote:
Please find below more detailed information regarding this "NOHZ:
local_softirq_pending" message.
Iratxo Pichel Ortiz wrote:
Thomas,
More info below, I hope it helps.
Iratxo Pichel Ortiz wrote:
Thomas,
Iratxo Pichel Ortiz wrote:
Do you know what could be causing this issue? I have managed to
reproduce these traces (NOHZ...) without using my code, using a
workqueue whose work function does something like:
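static void work_func(struct work_struct *work)
{
	mdelay(10);			/* busy-wait for 10 ms */
	msleep(10);			/* then sleep for 10 ms */
	queue_work(myqueue, mywork);	/* re-queue the same work item */
}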
And then by heavily loading the box from the outside.
I have written a very small module that triggers the
"local_softirq_pending" message under some load. Please find the code at
the end of this email. Some dmesg traces are pasted below (I have
increased the ratelimit of the "NOHZ: local..." trace to 250).
The only strange thing here is that I am calling "set_workqueue_prio"
(I have hacked the source to export this symbol), and I am starting to
think that this might not be a good idea. Any hints about this?
[  648.954000] NOHZ: local_softirq_pending 0e
[  648.955000] NOHZ: local_softirq_pending 0e
[  648.956000] NOHZ: local_softirq_pending 0e
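The value printed after "local_softirq_pending" is a hexadecimal bitmask of the softirqs that were still pending. As a rough aid for reading it, here is a minimal user-space sketch that decodes such a mask. It assumes the usual numbering of the first four softirqs in include/linux/interrupt.h (HI=0, TIMER=1, NET_TX=2, NET_RX=3); the higher bits differ between kernel versions, so they are printed only as bit numbers. The function name decode_softirq_pending is made up for this illustration.

```c
#include <stdio.h>
#include <string.h>

/* Assumed names for softirq bits 0..3; check interrupt.h for your kernel. */
static const char *const softirq_names[] = { "HI", "TIMER", "NET_TX", "NET_RX" };

/*
 * Write a '|'-separated list of the bits set in 'pending' into 'buf'
 * (at most 'len' bytes including the terminator) and return 'buf'.
 */
static char *decode_softirq_pending(unsigned int pending, char *buf, size_t len)
{
	const size_t known = sizeof(softirq_names) / sizeof(softirq_names[0]);
	size_t used = 0;
	unsigned int bit;

	if (len == 0)
		return buf;
	buf[0] = '\0';

	for (bit = 0; bit < 32; bit++) {
		int n;

		if (!(pending & (1u << bit)))
			continue;
		if (bit < known)
			n = snprintf(buf + used, len - used, "%s%s",
				     used ? "|" : "", softirq_names[bit]);
		else
			n = snprintf(buf + used, len - used, "%sbit%u",
				     used ? "|" : "", bit);
		if (n < 0 || (size_t)n >= len - used)
			break;	/* error or output truncated, stop here */
		used += (size_t)n;
	}
	return buf;
}
```

Under those assumptions, passing 0x0e yields "TIMER|NET_TX|NET_RX", i.e. the timer softirq plus both network softirqs were pending when the CPU tried to go idle.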
I have changed the implementation of the test module to use kthreads
instead of workqueues. The behavior is exactly the same. I have tried
with prios from 1 to 99. Please find the code below as before. I have
also attached the different softirq codes that were pending in some
of the tests.
I have even tried this without any system-loader module. Just by
booting the kernel and pinging the box very heavily, there are a lot
of NOHZ... traces in dmesg. Indeed, they follow a very strange pattern
that I cannot match with any part of the kernel. The pattern is the
following (NOHZ and HZ=1000):
[...]
So it seems that my "RT" task is delayed, as you said in your
original mail, when the 02 SIRQ is delayed, but the rest of the time
it runs correctly. This problem appears to be a kernel or RT patch
issue, so please let me know which tests you would like me to do. I
have a couple of boxes here and some time to build and test kernels.
Alternatively, if you would like me to look at any part of the system,
let me know and I will try my best.
Does it work when you disable CONFIG_NOHZ ?
Still pending to test.
I have tried disabling the CONFIG_NOHZ kernel option. Of course the
trace is gone, but the weird behavior is still there. When I run my
software without load from the network, the main task of the system
experiences runtimes of about 700 us. When I load the system, there are
latencies of 50700 us, so the 50 ms delay is there again, and again the
time when the task finishes is always X.296, one jiffy after the point
where the "NOHZ: pending..." message was shown with CONFIG_NOHZ enabled.
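To make the arithmetic explicit: the extra delay is the loaded figure minus the unloaded runtime, 50700 us - 700 us = 50000 us, which is 50 ms, or 50 timer ticks at the HZ=1000 mentioned earlier. A small sketch of that conversion follows; the helper name us_to_jiffies is hypothetical and this is plain user-space code, not a kernel API.

```c
/*
 * Convert a duration in microseconds to whole timer ticks (jiffies)
 * at a tick rate of 'hz' ticks per second, rounding down.
 */
static unsigned long long us_to_jiffies(unsigned long long us,
					unsigned long long hz)
{
	return (us * hz) / 1000000ULL;
}
```

With the numbers above, us_to_jiffies(50700 - 700, 1000) gives 50, so the task is being held back by about 50 jiffies under load.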
Please let me know what tests you would like me to do.
I will try this and let the list know.
Thanks,
tglx
Thanks,
Iratxo.
--
Iratxo Pichel Ortiz
Software Development Manager
Albentia Systems S.A.
http://www.albentia.com
Tel: +34 914400567
Cel: +34 663808405
Fax: +34 914400569
C\Margarita Salas 22
Parque Tecnológico de Leganés
Leganés (28918)
Madrid
Spain