On Fri, May 13, 2011 at 7:58 AM, Minchan Kim <minchan.kim@xxxxxxxxx> wrote:
> On Fri, May 13, 2011 at 7:15 AM, Johannes Weiner <hannes@xxxxxxxxxxx> wrote:
>> On Thu, May 12, 2011 at 05:04:41PM -0500, James Bottomley wrote:
>>> On Thu, 2011-05-12 at 15:04 -0500, James Bottomley wrote:
>>> > Confirmed, I'm afraid ... I can trigger the problem with all three
>>> > patches under PREEMPT.  It's not a hang this time, it's just kswapd
>>> > taking 100% system time on 1 CPU and it won't calm down after I unload
>>> > the system.
>>>
>>> Just on a "if you don't know what's wrong poke about and see" basis, I
>>> sliced out all the complex logic in sleeping_prematurely() and, as far
>>> as I can tell, it cures the problem behaviour.  I've loaded up the
>>> system, and taken the tar load generator through three runs without
>>> producing a spinning kswapd (this is PREEMPT).  I'll try with a
>>> non-PREEMPT kernel shortly.
>>>
>>> What this seems to say is that there's a problem with the complex logic
>>> in sleeping_prematurely().  I'm pretty sure hacking up
>>> sleeping_prematurely() just to dump all the calculations is the wrong
>>> thing to do, but perhaps someone can see what the right thing is ...
>>
>> I think I see the problem: the boolean logic of sleeping_prematurely()
>> is odd.  If it returns true, kswapd will keep running.  So if
>> pgdat_balanced() returns true, kswapd should go to sleep.
>>
>> This?
>
> Yes. Good catch.

In addition, I see something strange.
The comment in pgdat_balanced() says
"Only zones that meet watermarks and are in a zone allowed by the
callers classzone_idx are added to balanced_pages"

That is true for balance_pgdat(), but it is not true for
sleeping_prematurely().

This?

barrios@barrios-desktop:~/linux-mmotm$ git diff mm/vmscan.c
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 292582c..d9078cf 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2322,7 +2322,8 @@ static bool sleeping_prematurely(pg_data_t *pgdat, int order, long remaining,
 							classzone_idx, 0))
 			all_zones_ok = false;
 		else
-			balanced += zone->present_pages;
+			if (i <= classzone_idx)
+				balanced += zone->present_pages;
 	}
 
 	/*
@@ -2331,7 +2332,7 @@ static bool sleeping_prematurely(pg_data_t *pgdat, int order, long remaining,
 	 * must be balanced
 	 */
 	if (order)
-		return pgdat_balanced(pgdat, balanced, classzone_idx);
+		return !pgdat_balanced(pgdat, balanced, classzone_idx);
 	else
 		return !all_zones_ok;
 }

-- 
Kind regards,
Minchan Kim
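
For anyone who wants to poke at the boolean logic outside the kernel, here is a rough userspace sketch of the two changes discussed above. It is not the kernel code: struct zone_model, pgdat_balanced_model(), sleeping_prematurely_model() and the patched flag are hypothetical names made up for illustration. The "more than a quarter of the pages up to classzone_idx" threshold is an assumption of this sketch, and the real function's other per-zone checks are left out.

```c
#include <stdbool.h>

#define NR_ZONES 3	/* hypothetical node with three zones */

/* Hypothetical stand-in for struct zone: only what the sketch needs. */
struct zone_model {
	unsigned long present_pages;
	bool watermark_ok;	/* models the result of the watermark check */
};

/*
 * Assumption of this sketch: the node counts as balanced when the
 * balanced pages exceed a quarter of the pages in zones 0..classzone_idx.
 */
static bool pgdat_balanced_model(const struct zone_model *zones,
				 unsigned long balanced, int classzone_idx)
{
	unsigned long present = 0;
	int i;

	for (i = 0; i <= classzone_idx; i++)
		present += zones[i].present_pages;

	return balanced > (present >> 2);
}

/*
 * patched == false models the logic before the diff,
 * patched == true models the logic after it.
 * A return value of true means "kswapd should keep running".
 */
static bool sleeping_prematurely_model(const struct zone_model *zones,
				       int order, int classzone_idx,
				       bool patched)
{
	unsigned long balanced = 0;
	bool all_zones_ok = true;
	int i;

	for (i = 0; i < NR_ZONES; i++) {
		if (!zones[i].watermark_ok)
			all_zones_ok = false;
		else if (!patched || i <= classzone_idx)
			balanced += zones[i].present_pages;
	}

	if (order) {
		bool ok = pgdat_balanced_model(zones, balanced, classzone_idx);

		/* Before the diff the result was returned without negation. */
		return patched ? !ok : ok;
	}
	return !all_zones_ok;
}
```

With three zones of 1000, 9000 and 50000 pages that all meet their watermarks, order 1 and classzone_idx 1, the unpatched model returns true (kswapd never sleeps on a balanced node) and the patched model returns false. With no zone meeting its watermark the results flip: the unpatched model returns false and the patched one returns true.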