Re: swap storms on low memory i386, tasks blocked on i386 and amd64 for kernel > 2.6.36-git6

Can you turn off swap and try again?
Are you using a swap partition or a swap file?
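
To answer the partition-or-file question, /proc/swaps lists every active
swap area together with its type. Below is a minimal Python sketch, assuming
the usual five-column layout (Filename, Type, Size, Used, Priority, with
sizes in KiB). The names SwapArea and parse_swaps are only illustrative.

```python
from pathlib import Path
from typing import List, NamedTuple


class SwapArea(NamedTuple):
    path: str
    kind: str       # "partition" or "file", as reported by the kernel
    size_kib: int
    used_kib: int
    priority: int


def parse_swaps(text: str) -> List[SwapArea]:
    """Parse the contents of /proc/swaps, skipping the header line."""
    areas = []
    for line in text.splitlines()[1:]:
        fields = line.split()
        if len(fields) != 5:
            continue  # ignore anything that does not match the assumed layout
        path, kind, size, used, priority = fields
        areas.append(SwapArea(path, kind, int(size), int(used), int(priority)))
    return areas


if __name__ == "__main__":
    swaps = Path("/proc/swaps")
    if swaps.exists():
        for area in parse_swaps(swaps.read_text()):
            print(f"{area.path}: {area.kind}, {area.size_kib} KiB total, "
                  f"{area.used_kib} KiB used")
```

For the test itself, swap can be disabled as root with "swapoff -a" and
re-enabled afterwards with "swapon -a".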

Please search the mailing list for the following phrase; could this be
an issue in the new release?
"2.6.36 io bring the system to its knees"

__
Tharindu R Bamunuarachchi.




On Sun, Oct 31, 2010 at 5:50 PM, Mulyadi Santosa
<mulyadi.santosa@xxxxxxxxx> wrote:
> Hi Arthur..
>
> Interesting to read your (1st? really?) post, but I am not sure how
> far I can help...so I'll just take a shot...
>
> On Sun, Oct 31, 2010 at 18:36, Arthur Marsh
> <arthur.marsh@xxxxxxxxxxxxxxxx> wrote:
>> I have a PII-266 with 384 MiB RAM (maximum capacity for the machine) and
>> an AMD64 dual core with 4 GiB RAM, both running Debian unstable plus
>> some packages from experimental and custom kernels.
>>
>> A typical load for the PII-266 is KDE 3.5.10 with konversation, icedove,
>> iceweasel, xmms, lynx, hp-systray, top and aptitude-curses.
>>
>> This works with stock Debian kernels and custom kernels up to and
>> including 2.6.36-git6. Under heavy load, free RAM will hover around 5
>> MiB but audio will still play with a very occasional skip and all
>> applications are responsive.
>
> That's free RAM. How about the buffers and page cache size?
>
> And how big is your swap? Is it a single partition on one disk,
> or...?
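
The buffer, page cache and swap figures asked about here can all be read
from /proc/meminfo. A minimal sketch, assuming the standard "Key: value kB"
line format; parse_meminfo and summarize are made-up helper names, and the
choice of fields is mine:

```python
from pathlib import Path
from typing import Dict


def parse_meminfo(text: str) -> Dict[str, int]:
    """Map each /proc/meminfo key to its numeric value (kB for most fields)."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[key.strip()] = int(parts[0])
    return info


def summarize(info: Dict[str, int]) -> Dict[str, int]:
    """Pick out the fields relevant to this thread; missing keys count as 0."""
    fields = ("MemFree", "Buffers", "Cached", "SwapTotal", "SwapFree")
    return {name: info.get(name, 0) for name in fields}


if __name__ == "__main__":
    meminfo = Path("/proc/meminfo")
    if meminfo.exists():
        summary = summarize(parse_meminfo(meminfo.read_text()))
        for name, kib in summary.items():
            print(f"{name:10s} {kib / 1024:8.1f} MiB")
```

Posting this output from both a good and a bad kernel under the same load
would make the comparison easier.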
>
>> The newer kernels, e.g. 2.6.36-git9, -git10 and -git11 (all with the
>> deadline scheduler), get into a swap storm with over 32 MiB RAM free
>> and kswapd0 taking 10 percent or more of CPU time. Kernel 2.6.36-git15
>> with the cfq scheduler managed to keep smaller applications like xmms
>> and shells running but also had excessive free RAM and kswapd0 taking
>> more than 10 percent of CPU time.
>
> Have you tried adjusting /proc/sys/vm/swappiness?
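
For reference, the swappiness file holds a single integer (the default is
normally 60). A small sketch for reading it and preparing a new value; the
helper names are hypothetical, and the upper bound of 100 is an assumption
that matches 2.6-era kernels:

```python
from pathlib import Path

SWAPPINESS_PATH = Path("/proc/sys/vm/swappiness")


def parse_swappiness(text: str) -> int:
    """Parse the single integer the kernel exposes in the swappiness file."""
    return int(text.strip())


def format_swappiness(value: int, maximum: int = 100) -> str:
    """Validate a new setting and return the text to write to the file.

    maximum=100 is an assumption, not something read from the kernel.
    """
    if not 0 <= value <= maximum:
        raise ValueError(f"swappiness must be between 0 and {maximum}, got {value}")
    return f"{value}\n"


if __name__ == "__main__":
    if SWAPPINESS_PATH.exists():
        current = parse_swappiness(SWAPPINESS_PATH.read_text())
        print("current swappiness:", current)
        # Changing the value needs root, for example:
        # SWAPPINESS_PATH.write_text(format_swappiness(20))
```

A lower value should make the kernel less eager to swap out application
pages, so trying a few settings on the PII-266 may show whether the storm
depends on it.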
>
>> On the AMD64 dual core when compiling kernels with CONCURRENCY_LEVEL=4,
>> I would sometimes get the build process pausing with errors like:
>>
>> [ 2880.492025] INFO: task sh:10071 blocked for more than 120 seconds.
>> [ 2880.493165] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> [ 2880.494299] sh       D ffff8800cfc53780   0 10071  9092 0x00000000
>> [ 2880.495433]  ffff880128b214a0 0000000000000086 ffff880128851b00 ffff88012b67cba0
>> [ 2880.496612]  0000000000013780 0000000000013780 ffff88011969dfd8 ffff88011969c000
>> [ 2880.497765]  ffff880128b21798 ffff880128b217a0 ffff880128b214a0 ffff88011969dfd8
>> [ 2880.498925] Call Trace:
>> [ 2880.500096]  [<ffffffff810086fc>] ? __switch_to+0x198/0x284
>> [ 2880.501262]  [<ffffffff8132d36b>] ? schedule_timeout+0x2d/0xd7
>> [ 2880.502418]  [<ffffffff8103999f>] ? need_resched+0x1a/0x23
>> [ 2880.503564]  [<ffffffff8132d13f>] ? schedule+0x5b2/0x5c9
>> [ 2880.504739]  [<ffffffff8103e36d>] ? get_parent_ip+0x9/0x1b
>> [ 2880.505883]  [<ffffffff8132c9c6>] ? wait_for_common+0x9d/0x10c
>> [ 2880.507024]  [<ffffffff81043d10>] ? default_wake_function+0x0/0xf
>> [ 2880.508189]  [<ffffffff81089999>] ? stop_one_cpu+0x57/0x6e
>> [ 2880.509288]  [<ffffffff81043b23>] ? migration_cpu_stop+0x0/0x2a
>> [ 2880.510354]  [<ffffffff81331951>] ? sub_preempt_count+0x83/0x94
>> [ 2880.511426]  [<ffffffff8103f880>] ? sched_exec+0xbe/0xd6
>> [ 2880.512523]  [<ffffffff810fea4b>] ? do_execve+0xd1/0x28e
>> [ 2880.513582]  [<ffffffff81010786>] ? sys_execve+0x3f/0x54
>> [ 2880.514616]  [<ffffffff81009fdc>] ? stub_execve+0x6c/0xc0
>> amarsh04@am64:~$ uname -a
>> Linux am64 2.6.36-git16 #1 SMP PREEMPT Sun Oct 31 15:41:03 CST 2010 x86_64 GNU/Linux
>>
>> The process was unblocked by logging another session into the AMD64 machine.
>>
>> The "task foo blocked for more than 120 seconds" has also occurred on
>> the PII-266 uniprocessor machine with some of the 2.6.36-git9 or later
>> kernels. Previously I had not had such a problem on the PII-266 for
>> about 6 months.
>>
>> I can't begin to figure out where this problem was introduced.
>> Git-bisection doesn't always work as it may take a while for the
>> symptoms of a swap storm to appear.
>>
>> Is there any straightforward way to gather more information before
>> reporting this kind of issue upstream?
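
One low-effort way to put numbers on a swap storm is to sample the pswpin
and pswpout counters in /proc/vmstat (pages swapped in and out since boot)
and compare the rates between a good and a bad kernel. A rough sketch; the
function names and the one-second interval are arbitrary choices of mine:

```python
import time
from pathlib import Path
from typing import Dict


def parse_vmstat(text: str) -> Dict[str, int]:
    """Parse the 'name value' lines of /proc/vmstat into a dict of counters."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def swap_rates(before: Dict[str, int], after: Dict[str, int],
               seconds: float) -> Dict[str, float]:
    """Pages swapped in and out per second between two snapshots."""
    return {
        name: (after.get(name, 0) - before.get(name, 0)) / seconds
        for name in ("pswpin", "pswpout")
    }


if __name__ == "__main__":
    vmstat = Path("/proc/vmstat")
    if vmstat.exists():
        interval = 1.0  # seconds between the two snapshots
        before = parse_vmstat(vmstat.read_text())
        time.sleep(interval)
        after = parse_vmstat(vmstat.read_text())
        print(swap_rates(before, after, interval))
```

Running this in a loop while reproducing the load, and logging the output
alongside the kernel version, would give upstream something concrete to
look at.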
>
> Hmm, just a hunch: could it be that the scheduler hackers have once
> again changed whether the child runs first after fork? I once read
> that the child runs first, but more recently I read that the parent
> runs first...
>
> I also think it might be related to a patch Fengguang Wu made a few
> weeks ago (IIRC) to reduce latency during high I/O... but I can't
> recall which patch that is. Try googling for it...
>
> NB: ftrace might help here... please read the ftrace tutorial
> posted on lwn.net in the last few editions...
>
>
> --
> regards,
>
> Mulyadi Santosa
> Freelance Linux trainer and consultant
>
> blog: the-hydra.blogspot.com
> training: mulyaditraining.blogspot.com
>
> --
> To unsubscribe from this list: send an email with
> "unsubscribe kernelnewbies" to ecartis@xxxxxxxxxxxx
> Please read the FAQ at http://kernelnewbies.org/FAQ
>
>
