Hi Dong,
I can quite reliably trigger this cpu stall error now. Just try to start
several KVM guests.
Good. BTW, we do repeated long-term tests 14 times per day with a single
kvm guest that runs on two cores and conducts a number of CPU
benchmarks. (https://www.osadl.org/?id=931) - never had this problem. So
it may be related to running more than a single kvm guest.
[..]
Are there any way I can use to narrow down this error?
cd /sys/kernel/debug/tracing/
echo 0 >tracing_on
echo 1 >events/enable
echo function >current_tracer
echo 14080 >buffer_size_kb
echo 1 >tracing_on
while true
do
if dmesg | tail -100 | grep -q "rcu_preempt detected stalls"
then
echo 0 >tracing_on
break
fi
sleep 1
done
Then start the kvm quests.
Alternatively, you may use the kernel parameter ftrace_dump_on_oops.
If the problem no longer occurs or behaves differently, try to reduce
the debug output step be step, e.g. disable less important events and
specify selected available_filter_functions in set_ftrace_filter.
When the problem can be reproduced and the system stalls the way you
observed earlier, enter
cat trace >/tmp/trace.txt
and try to find out what is going on. If you need help, compress the trace
bzip2 trace.txt
upload trace.txt.bz2 to the Internet for inspection and post the related
URL.
-Carsten.
--
To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html