On Mon, Feb 4, 2013 at 3:40 PM, Sven-Thorsten Dietrich <sven@xxxxxxxxxxxxxxxxxxxxx> wrote:
> On Mon, 2013-02-04 at 13:54 -0800, Austin Hendrix wrote:
>> An example of what I'm seeing from latencytop is:
>> [kfree] 7937.6 msec 6.4 %
>
> you are overwhelming folks with data now... but just a few more
> questions:
>
> - e.g. what does the RT program you are running do?
> - Code snippet that reproduces this issue?
> - list of other processes on system.
> - system specs; e.g. cpu speed, I am guessing around 8 - 12 Kilo Hz?
>
> Thanks
>
> Sven
>
>>
>> I poked around the kernel source a bit and was pretty stumped by this,
>> so I'm glad it's not just me.
>>
>> Thanks,
>> -Austin
>>
>> On Mon, Feb 4, 2013 at 3:49 AM, Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>> > On Tue, 29 Jan 2013, Austin Hendrix wrote:
>> >> I'm running 3.4.4-rt13 on my systems, and while the realtime
>> >> performance is great, I occasionally see non-realtime processes block
>> >> for several seconds. Running latencytop, it looks like the kfree
>> >> kernel process is the worst offender. Does anyone have advice on how I
>> >
>> > There is no kfree process. kfree() is a function to release memory
>> > allocated by kmalloc.
>> >
>> > Can you provide the latencytop output please ?
>> >
>> > Thanks,
>> >
>> > tglx
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in
>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>

I wish I had more data too; that's one of the frustrating things about
the troubleshooting process. Most of my attempts to capture more data
or narrow the problem have instead caused the system to stop
exhibiting problems.

The system in question is a dual Xeon L5520 (quad-core, 2.27GHz), with
24GB of RAM. It's inside our robots:
https://willowgarage.com/pages/pr2/overview

The system load is around 2-3 under normal use.
My realtime process is an EtherCAT master, using about 30-40% of one
core. We use it for 1kHz motor control, so the realtime deadline is
pretty lax: in the hundreds of microseconds.

The rest of the system load comes from a number of non-realtime
processes that are doing a significant amount of network I/O, along
with an NFS server.

Believe it or not, this system actually works quite well on the
3.0.6-rt17 kernel. I'm upgrading it to a newer kernel since I'm also
upgrading the base OS from Ubuntu Lucid to Precise.

I gave the 3.4.28-rt40 stable release a try today, and so far I
haven't seen the problems that I was seeing with 3.4.4-rt13.

I'd still like to know more about how to debug the problems I'm seeing
on 3.4.4-rt13, so that I can do a better job of debugging if problems
like this come up in the future.

Thanks,
-Austin