On Mon, May 23, 2011 at 9:34 PM, Minchan Kim <minchan.kim@xxxxxxxxx> wrote:
> On Tue, May 24, 2011 at 10:19 AM, Andrew Lutomirski <luto@xxxxxxx> wrote:
>> On Sun, May 22, 2011 at 7:12 PM, Minchan Kim <minchan.kim@xxxxxxxxx> wrote:
>>> Could you test below patch based on vanilla 2.6.38.6?
>>> The expect result is that system hang never should happen.
>>> I hope this is last test about hang.
>>>
>>> Thanks.
>>>
>>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>>> index 292582c..1663d24 100644
>>> --- a/mm/vmscan.c
>>> +++ b/mm/vmscan.c
>>> @@ -231,8 +231,11 @@ unsigned long shrink_slab(struct shrink_control *shrink,
>>>        if (scanned == 0)
>>>                scanned = SWAP_CLUSTER_MAX;
>>>
>>> -       if (!down_read_trylock(&shrinker_rwsem))
>>> -               return 1;       /* Assume we'll be able to shrink next time */
>>> +       if (!down_read_trylock(&shrinker_rwsem)) {
>>> +               /* Assume we'll be able to shrink next time */
>>> +               ret = 1;
>>> +               goto out;
>>> +       }
>>>
>>>        list_for_each_entry(shrinker, &shrinker_list, list) {
>>>                unsigned long long delta;
>>> @@ -286,6 +289,8 @@ unsigned long shrink_slab(struct shrink_control *shrink,
>>>                shrinker->nr += total_scan;
>>>        }
>>>        up_read(&shrinker_rwsem);
>>> +out:
>>> +       cond_resched();
>>>        return ret;
>>> }
>>>
>>> @@ -2331,7 +2336,7 @@ static bool sleeping_prematurely(pg_data_t
>>> *pgdat, int order, long remaining,
>>>         * must be balanced
>>>         */
>>>        if (order)
>>> -               return pgdat_balanced(pgdat, balanced, classzone_idx);
>>> +               return !pgdat_balanced(pgdat, balanced, classzone_idx);
>>>        else
>>>                return !all_zones_ok;
>>> }
>>
>> So far with this patch I can't reproduce the hang or the bogus OOM.
>>
>> To be completely clear, I have COMPACTION, MIGRATION, and THP off, I'm
>> running 2.6.38.6, and I have exactly two patches applied.  One is the
>> attached patch and the other is a the fpu.ko/aesni_intel.ko merger
>> which I need to get dracut to boot my box.
>>
>> For fun, I also upgraded to 8GB of RAM and it still works.
>>
>
> Hmm.
> Could you test it with enable thp and 2G RAM?
> Isn't it a original test environment?
> Please don't change test environment. :)

The test that passed last night ran in an environment (hardware and
config) that I had earlier confirmed as failing without the patch.

I just re-tested my original config (from a backup -- migration,
compaction, and thp "always" are enabled).  I get bogus OOMs but not a
hang.  (I'm running with mem=2G right now -- I'll swap the DIMMs back
out later on if you want.)

I attached the bogus OOM (actually several that happened in sequence).
They look readahead-related.  There was plenty of free swap space.

--Andy