Re: Soft lock issue with 2.6.33.7-rt29

On 11/19/2010 10:46 AM, Darren Hart wrote:
On 11/18/2010 02:11 PM, Nathan Grennan wrote:
On 11/17/2010 05:26 PM, Darren Hart wrote:
On 11/17/2010 11:11 AM, Nathan Grennan wrote:
I have been working for weeks to get a stable rt kernel. I had been
focusing on 2.6.31.6-rt19. It is stable for about four days under stress
testing before it soft locks. I am using rt19 instead of rt21, because
rt19 seems to be more stable. The rtmutex issue that seems to still be
in rt29 is in rt21. I also had to backport the iptables fix to rt19.

I just started looking at 2.6.33.7-rt29 again, since I can reproduce a
soft lock with it in 10-15 minutes. I have yet to get sysrq output for
rt19, since it takes four days. The soft lock with rt29 as far as I can
tell seems to relate to disk i/o.

There are links to two logs of rt29 from a serial console below. They
include sysrq output like "Show Blocked State" and "Show State". The
level7 file is with nfsd enabled, and level9 is with it disabled. So nfsd
doesn't seem to be the issue.

If any other debugging information is useful or needed, just say the
word.

A reproducible test-case is always the first thing we ask for :-) What
is your stress test?

I have been able to boil it down to the script below. If I just run yes,
it is fine. If I just run dd, it is fine. If I just run octave, it is
fine. Running yes+dd gets it most of the way there, but the system still
wakes up off and on. Run all three together and it soft locks within
5-15 minutes. I did it on our main example hardware, which is a server.
I have also reproduced it on a desktop. Sometimes sysrq-n, to renice
realtime processes, brings it out of it enough that you can kill
processes off.


Interesting, so you're locking up a preempt-rt kernel with SCHED_OTHER tasks running at the least favorable priority.

Note: nice -n 19 is actually the highest valid nice value (20 and higher seem to be accepted, but have the same effect as 19). See NICE(1).
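
The clamping mentioned in the note can be checked with a small sketch like the one below. It is not part of the original test script; the helper name effective_nice is made up for illustration, and it assumes a Linux system where the calling process starts at niceness 0. Each request runs in a throwaway child process so the caller's own priority is left alone.

```python
import subprocess
import sys


def effective_nice(requested: int) -> int:
    """Return the niceness a child process ends up with after asking for `requested`.

    Hypothetical helper for illustration. Assumes the parent runs at niceness 0,
    so the increment passed to os.nice() equals the requested nice value.
    """
    # os.nice() adds the increment to the current niceness and returns the
    # resulting value, which is what the kernel actually applied.
    child_code = f"import os; print(os.nice({requested}))"
    result = subprocess.run(
        [sys.executable, "-c", child_code],
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())


if __name__ == "__main__":
    for requested in (19, 20, 25):
        print(f"requested {requested}, got {effective_nice(requested)}")
```

On Linux, all three requests should report the same resulting niceness, which is the behaviour the note describes: values above 19 are accepted but have no extra effect.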

How many CPUs on your test machine?

The server is dual quad-core. The desktop is a quad-core with hyperthreading. Both are i7 based.