On Tue, Jun 22, 2010 at 11:24 AM, KOSAKI Motohiro
<kosaki.motohiro@xxxxxxxxxxxxxx> wrote:
>> > =============================================================
>> > Subject: [PATCH] Call cond_resched() at bottom of main look in balance_pgdat()
>> > From: Larry Woodman <lwoodman@xxxxxxxxxx>
>> >
>> > We are seeing a problem where kswapd gets stuck and hogs the CPU on a
>> > small single CPU system when an OOM kill should occur. When this
>> > happens swap space has been exhausted and the pagecache has been shrunk
>> > to zero. Once kswapd gets the CPU it never gives it up because at least
>> > one zone is below high. Adding a single cond_resched() at the end of
>> > the main loop in balance_pgdat() fixes the problem by allowing the
>> > watchdog and tasks to run and eventually do an OOM kill which frees up
>> > the resources.
>> >
>> > kosaki note: This seems regression caused by commit bb3ab59683
>> > (vmscan: stop kswapd waiting on congestion when the min watermark is
>> > not being met)
>> >
>> > Signed-off-by: Larry Woodman <lwoodman@xxxxxxxxxx>
>> > Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@xxxxxxxxxxxxxx>
>> > ---
>> >  mm/vmscan.c |    1 +
>> >  1 files changed, 1 insertions(+), 0 deletions(-)
>> >
>> > diff --git a/mm/vmscan.c b/mm/vmscan.c
>> > index 9c7e57c..c5c46b7 100644
>> > --- a/mm/vmscan.c
>> > +++ b/mm/vmscan.c
>> > @@ -2182,6 +2182,7 @@ loop_again:
>> >                */
>> >               if (sc.nr_reclaimed >= SWAP_CLUSTER_MAX)
>> >                       break;
>> > +             cond_resched();
>> >       }
>> >  out:
>> >       /*
>> > --
>> > 1.6.5.2
>>
>> Kosaki's patch's goal is that kswap doesn't yield cpu if the zone doesn't meet its
>> min watermark to avoid failing atomic allocation.
>> But this patch could yield kswapd's time slice at any time.
>> Doesn't the patch break your goal in bb3ab59683?
>
> No. it don't break.
>
> Typically, kswapd periodically call shrink_page_list() and it call
> cond_resched() even if bb3ab59683 case.

Hmm. If so, is bb3ab59683 really effective?
The goal of bb3ab59683 is to prevent kswapd from yielding the CPU in the
case of free < min_watermark. But shrink_page_list() can yield the CPU
from kswapd at any time, so I am not sure what the benefit of bb3ab59683 is.

Do you have any numbers on the effectiveness of bb3ab59683?
(Of course, I know that is very hard to measure. I am just curious.)

As a matter of fact, when I saw Larry's patch, I thought it would be
better to revert bb3ab59683. Then congestion_wait() could yield the CPU
to other processes.

What do you think?

-- 
Kind regards,
Minchan Kim
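
A toy user-space model may make the question about yield points concrete.
It is only a sketch, not kernel code: toy_shrink(), toy_balance(),
struct toy_stats, and all of their parameters are hypothetical names
invented for illustration. The model also assumes that the reclaim helper
reaches its yield point only when there is something to scan. It counts
the three places where the loop could give up the CPU: inside the reclaim
helper, in a congestion wait, and at the bottom of the loop.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only. Every name here is hypothetical, not a kernel symbol. */
struct toy_stats {
	int resched_in_shrink;	/* yield points hit inside the reclaim helper */
	int congestion_waits;	/* sleeps taken in the congestion wait stand-in */
	int resched_at_bottom;	/* yield points hit at the bottom of the loop */
};

/*
 * Stand-in for shrink_page_list(). Assumption: it only reaches its
 * yield point when it is handed at least one page to process.
 */
static void toy_shrink(int nr_pages, struct toy_stats *st)
{
	if (nr_pages > 0)
		st->resched_in_shrink++;
}

/*
 * Stand-in for the main loop of balance_pgdat().
 *
 * skip_wait_below_min models the behaviour attributed to bb3ab59683:
 *   no congestion wait while a zone is below its min watermark.
 * resched_at_bottom models the one-line patch quoted in this thread.
 */
static struct toy_stats toy_balance(int passes, int reclaimable_pages,
				    bool below_min, bool skip_wait_below_min,
				    bool resched_at_bottom)
{
	struct toy_stats st = { 0, 0, 0 };

	assert(passes >= 0);

	for (int i = 0; i < passes; i++) {
		toy_shrink(reclaimable_pages, &st);

		/* Sleep unless the below-min shortcut applies. */
		if (!(below_min && skip_wait_below_min))
			st.congestion_waits++;

		/* The extra yield point at the end of each pass. */
		if (resched_at_bottom)
			st.resched_at_bottom++;
	}
	return st;
}
```

In this model, toy_balance(5, 0, true, true, false) returns all-zero
counters, which corresponds to the reported hang: nothing left to reclaim,
the wait skipped, and no other yield point. Passing true for
resched_at_bottom, or false for skip_wait_below_min (the revert case),
gives one yield or wait per pass. With reclaimable_pages > 0 the helper
yields on every pass even though the wait is skipped.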