Re: [patch 00/10 -mm v3] oom killer rewrite

On Wed, 10 Mar 2010 02:41:08 -0800 (PST)
David Rientjes <rientjes@xxxxxxxxxx> wrote:

> This patchset is a rewrite of the out of memory killer to address several
> issues that have been raised recently.  The most notable change is a
> complete rewrite of the badness heuristic that determines which task is
> killed; the goal was to make it as simple and predictable as possible
> while still addressing issues that plague the VM.
> 
> Changes from version 2:
> 
>  - updated to mmotm-2010-03-09-19-15
> 
>  - schedule a timeout for current if it was not selected for oom kill
>    when it has returned VM_FAULT_OOM so memory can freed to prevent
>    needlessly recalling the oom killer and looping.
> 

To me, this seems to work better than the current oom-killer: the memory eater dies first.
Thanks.

BTW, it seems a serial oom-kill (several kills in a row) can still happen.

Assume I run a memory eater (called malloc) on a host.
==
Mar 13 13:05:56 localhost kernel: malloc invoked oom-killer: gfp_mask=0x280da, order=0, oom_adj=0, oom_score_adj=0
Mar 13 13:05:56 localhost kernel: malloc cpuset=/ mems_allowed=0
Mar 13 13:05:56 localhost kernel: Pid: 2525, comm: malloc Not tainted 2.6.34-rc1-mm1+ #3
Mar 13 13:05:56 localhost kernel: Call Trace:
Mar 13 13:05:56 localhost kernel: [<ffffffff8108aebf>] ? cpuset_print_task_mems_allowed+0x91/0x9c
Mar 13 13:05:56 localhost kernel: [<ffffffff810c90c1>] dump_header+0x74/0x1af
<snip>
Mar 13 13:05:56 localhost kernel: [ 2525]   500  2525   434340   433346   0       0             0 malloc
Mar 13 13:05:56 localhost kernel: Out of memory: Kill process 2525 (malloc) with score 967 or sacrifice child
Mar 13 13:05:56 localhost kernel: Killed process 2525 (malloc) total-vm:1737360kB, anon-rss:1733364kB, file-rss:20kB
Mar 13 13:05:56 localhost kernel: rsyslogd invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0, oom_score_adj=0
Mar 13 13:05:56 localhost kernel: rsyslogd cpuset=/ mems_allowed=0
Mar 13 13:05:56 localhost kernel: Pid: 696, comm: rsyslogd Not tainted 2.6.34-rc1-mm1+ #3
Mar 13 13:05:56 localhost kernel: Call Trace:
Mar 13 13:05:56 localhost kernel: [<ffffffff8108aebf>] ? cpuset_print_task_mems_allowed+0x91/0x9c
Mar 13 13:05:56 localhost kernel: [<ffffffff810c90c1>] dump_header+0x74/0x1af
Mar 13 13:05:56 localhost kernel: [<ffffffff81211a8e>] ? ___ratelimit+0xe6/0x104
Mar 13 13:05:56 localhost kernel: [<ffffffff810c942a>] oom_kill_process+0x49/0x1ed
<snip>
Mar 13 13:05:56 localhost kernel: 480 total pagecache pages
Mar 13 13:05:56 localhost kernel: 0 pages in swap cache
Mar 13 13:05:56 localhost kernel: Swap cache stats: add 0, delete 0, find 0/0
Mar 13 13:05:56 localhost kernel: Free swap  = 0kB
Mar 13 13:05:56 localhost kernel: Total swap = 0kB
Mar 13 13:05:56 localhost kernel: 2097151 pages RAM
Mar 13 13:05:56 localhost kernel: 48776 pages reserved
Mar 13 13:05:56 localhost kernel: 1356 pages shared
Mar 13 13:05:56 localhost kernel: 458132 pages non-shared
<snip>
Mar 13 13:05:56 localhost kernel: [ 2506]     0  2506     3120       55   0       0             0 anacron
Mar 13 13:05:56 localhost kernel: Out of memory: Kill process 1267 (gdm-simple-gree) with score 2 or sacrifice child
Mar 13 13:05:56 localhost kernel: Killed process 1267 (gdm-simple-gree) total-vm:359156kB, anon-rss:4012kB, file-rss:472kB
==

At first malloc, the bad program, is killed. But another oom-kill happens immediately
and gdm-simple-gree is killed.

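I think there is a task in the tasklist which has TIF_MEMDIE set but whose p->mm is
already NULL (!p->mm). select_bad_process() skips such a task before it reaches the
TIF_MEMDIE check, so it goes on and picks another victim.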

This is because exit_mm()'s logic is as follows:
	mm = task->mm;
	task->mm = NULL;
	mmput(mm);
		-> free pages under this mm.
So there is a window where task->mm is NULL but the pages are not freed yet.

This patch makes the result better on my box: no serial kills, at least.
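
As a rough illustration of why the order of the two checks matters, here is a
small user-space sketch. It is not kernel code: struct toy_task, select_victim(),
TOY_WAIT and TOY_NONE are made-up names for this example, "has_mm" stands in for
p->mm != NULL, and "memdie" stands in for TIF_MEMDIE. The scoring is reduced to
"highest points wins".

```c
#include <stdbool.h>

/* Hypothetical stand-in for task_struct; only what the example needs. */
struct toy_task {
	const char *comm;
	bool has_mm;		/* false once exit_mm() has set task->mm = NULL */
	bool memdie;		/* stands in for TIF_MEMDIE */
	unsigned int points;	/* simplified badness score */
};

#define TOY_WAIT (-1)	/* a dying task was found: do not kill anyone else */
#define TOY_NONE (-2)	/* no candidate found */

/*
 * Walk the task array and return the index of the victim, TOY_WAIT or
 * TOY_NONE. With memdie_first == false the memdie check comes after the
 * "no mm" skip (the order before the patch); with memdie_first == true
 * it comes before the skip (the order after the patch).
 */
static int select_victim(const struct toy_task *tasks, int n, bool memdie_first)
{
	int chosen = TOY_NONE;
	unsigned int best = 0;

	for (int i = 0; i < n; i++) {
		const struct toy_task *p = &tasks[i];

		if (memdie_first && p->memdie)
			return TOY_WAIT;
		/* skip tasks which have already released their mm */
		if (!p->has_mm)
			continue;
		if (!memdie_first && p->memdie)
			return TOY_WAIT;
		if (p->points > best) {
			best = p->points;
			chosen = i;
		}
	}
	return chosen;
}
```

With a task list where malloc has memdie set but has_mm already false, the old
order skips malloc and returns the index of the next best task, while the new
order returns TOY_WAIT. That matches what I see in the log above, under the
assumptions of this toy model.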

-Kame
==

---
 mm/oom_kill.c |   22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

Index: mmotm-2.6.34-Mar11/mm/oom_kill.c
===================================================================
--- mmotm-2.6.34-Mar11.orig/mm/oom_kill.c
+++ mmotm-2.6.34-Mar11/mm/oom_kill.c
@@ -290,6 +290,17 @@ static struct task_struct *select_bad_pr
 	for_each_process(p) {
 		unsigned int points;
 		/*
+		 * This task already has access to memory reserves and is
+		 * being killed. Don't allow any other task access to the
+		 * memory reserve.
+		 *
+		 * Note: this may have a chance of deadlock if it gets
+		 * blocked waiting for another task which itself is waiting
+		 * for memory. Is there a better alternative?
+		 */
+		if (test_tsk_thread_flag(p, TIF_MEMDIE))
+			return ERR_PTR(-1UL);
+		/*
 		 * skip kernel threads and tasks which have already released
 		 * their mm.
 		 */
@@ -305,17 +316,6 @@ static struct task_struct *select_bad_pr
 									 NULL))
 			continue;
 
-		/*
-		 * This task already has access to memory reserves and is
-		 * being killed. Don't allow any other task access to the
-		 * memory reserve.
-		 *
-		 * Note: this may have a chance of deadlock if it gets
-		 * blocked waiting for another task which itself is waiting
-		 * for memory. Is there a better alternative?
-		 */
-		if (test_tsk_thread_flag(p, TIF_MEMDIE))
-			return ERR_PTR(-1UL);
 
 		/*
 		 * This is in the process of releasing memory so wait for it


Thanks,
-Kame


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxxx  For more info on Linux MM,
see: http://www.linux-mm.org/ .