Re: kfree causing high latency on 3.4.4-rt13

On Mon, Feb 4, 2013 at 3:40 PM, Sven-Thorsten Dietrich
<sven@xxxxxxxxxxxxxxxxxxxxx> wrote:
> On Mon, 2013-02-04 at 13:54 -0800, Austin Hendrix wrote:
>> An example of what I'm seeing from latencytop is:
>> [kfree]                                           7937.6 msec          6.4 %
>
> you are overwhelming folks with data now... but just a few more
> questions:
>
> - e.g. what does the RT program you are running do?
> - Code snippet that reproduces this issue?
> - list of other processes on system.
> - system specs; e.g. cpu speed, I am guessing around 8 - 12 Kilo Hz?
>
> Thanks
>
> Sven
>
>
>>
>> I poked around the kernel source a bit and was pretty stumped by this,
>> so I'm glad it's not just me.
>>
>> Thanks,
>> -Austin
>>
>> On Mon, Feb 4, 2013 at 3:49 AM, Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>> > On Tue, 29 Jan 2013, Austin Hendrix wrote:
>> >> I'm running 3.4.4-rt13 on my systems, and while the realtime
>> >> performance is great, I occasionally see non-realtime processes block
>> >> for several seconds. Running latencytop, it looks like the kfree
>> >> kernel process is the worst offender. Does anyone have advice on how I
>> >
>> > There is no kfree process. kfree() is a function to release memory
>> > allocated by kmalloc.
>> >
>> > Can you provide the latencytop output please ?
>> >
>> > Thanks,
>> >
>> >         tglx
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in
>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
>
>

I wish I had more data too; that's one of the frustrating things about
the troubleshooting process. Most of my attempts to capture more data
or narrow the problem have instead caused the system to stop
exhibiting problems.

The system in question is a dual Xeon L5520 (quad-core, 2.27GHz), with
24GB of RAM. It's inside our robots:
https://willowgarage.com/pages/pr2/overview

The system load is around 2-3 under normal use.
My realtime process is an EtherCAT master, using about 30-40% of one
core. We use it for 1 kHz motor control, so the realtime deadline is
pretty lax: in the hundreds of microseconds.
The rest of the system load comes from a number of non-realtime
processes that are doing a significant amount of network I/O, along
with an NFS server.
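
To make the timing budget concrete, here is a rough sketch of how wakeup
lateness for a 1 kHz loop could be measured. This is not the actual EtherCAT
master and Python is not a realtime environment; the function names
(sample_wakeups, lateness_us) are made up for illustration, and the only
figure taken from the real system is the 1 ms period.

```python
import time

PERIOD_NS = 1_000_000  # 1 kHz control loop -> 1 ms period


def lateness_us(pairs):
    """Given (deadline_ns, wakeup_ns) pairs, return lateness in microseconds."""
    return [(wakeup - deadline) / 1000.0 for deadline, wakeup in pairs]


def sample_wakeups(cycles, period_ns=PERIOD_NS):
    """Sleep until each absolute deadline and record when we actually woke up.

    Hypothetical helper: a real RT task would run under SCHED_FIFO and use an
    absolute-time sleep in C; time.sleep() only approximates that.
    """
    deadline = time.monotonic_ns()
    pairs = []
    for _ in range(cycles):
        deadline += period_ns
        delay_s = (deadline - time.monotonic_ns()) / 1e9
        if delay_s > 0:
            time.sleep(delay_s)
        pairs.append((deadline, time.monotonic_ns()))
    return pairs


if __name__ == "__main__":
    late = lateness_us(sample_wakeups(1000))
    print("worst wakeup lateness: %.1f us" % max(late))
```

With a deadline in the hundreds of microseconds, the number to watch is the
worst-case lateness rather than the average.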

Believe it or not, this system actually works quite well on the
3.0.6-rt17 kernel. I'm upgrading it to a newer kernel since I'm also
upgrading the base OS from Ubuntu Lucid to Precise.

I gave the 3.4.28-rt40 stable release a try today, and so far I
haven't seen the problems that I was seeing with 3.4.4-rt13. I'd still
like to know more about how to debug the problems I'm seeing on
3.4.4-rt13 so that I can do a better job of debugging if problems like
this come up in the future.
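
For what it's worth, one thing that could help is logging latencytop output
over time and filtering it afterwards. Below is a minimal sketch that parses
lines shaped like the "[kfree] ... msec ... %" example quoted above. It
assumes the columns are cause, latency in msec, and percentage; the helper
names (parse_line, worst_offenders) and the 1000 msec threshold are
illustrative choices, not part of latencytop.

```python
import re

# Matches lines such as:
# [kfree]                                           7937.6 msec          6.4 %
LINE_RE = re.compile(
    r"^\[(?P<cause>[^\]]+)\]\s+(?P<latency_ms>[\d.]+)\s+msec\s+(?P<pct>[\d.]+)\s*%"
)


def parse_line(line):
    """Return (cause, latency_ms, percent), or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    return (
        match.group("cause"),
        float(match.group("latency_ms")),
        float(match.group("pct")),
    )


def worst_offenders(lines, threshold_ms=1000.0):
    """Return parsed rows at or above threshold_ms, largest latency first."""
    rows = [row for row in map(parse_line, lines) if row is not None]
    rows = [row for row in rows if row[1] >= threshold_ms]
    return sorted(rows, key=lambda row: row[1], reverse=True)
```

Feeding this a captured log would at least show which entries repeatedly cross
the multi-second mark, even when the problem is hard to reproduce on demand.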

Thanks,
-Austin