On Sat, May 21, 2011 at 8:04 AM, KOSAKI Motohiro <kosaki.motohiro@xxxxxxxxxxxxxx> wrote:
>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>> index 3f44b81..d1dabc9 100644
>> @@ -1426,8 +1437,13 @@ shrink_inactive_list(unsigned long nr_to_scan,
>> struct zone *zone,
>>
>>        /* Check if we should syncronously wait for writeback */
>>        if (should_reclaim_stall(nr_taken, nr_reclaimed, priority, sc)) {
>> +               unsigned long nr_active, old_nr_scanned;
>>                set_reclaim_mode(priority, sc, true);
>> +               nr_active = clear_active_flags(&page_list, NULL);
>> +               count_vm_events(PGDEACTIVATE, nr_active);
>> +               old_nr_scanned = sc->nr_scanned;
>>                nr_reclaimed += shrink_page_list(&page_list, zone, sc);
>> +               sc->nr_scanned = old_nr_scanned;
>>        }
>>
>>        local_irq_disable();
>>
>> I just tested 2.6.38.6 with the attached patch.  It survived dirty_ram
>> and test_mempressure without any problems other than slowness, but
>> when I hit ctrl-c to stop test_mempressure, I got the attached oom.
>
> Minchan,
>
> I'm confused now.
> If pages got SetPageActive(), should_reclaim_stall() should never return true.
> Can you please explain which bad scenario was happen?
>
> -----------------------------------------------------------------------------------------------------
> static void reset_reclaim_mode(struct scan_control *sc)
> {
>        sc->reclaim_mode = RECLAIM_MODE_SINGLE | RECLAIM_MODE_ASYNC;
> }
>
> shrink_page_list()
> {
>  (snip)
>  activate_locked:
>                SetPageActive(page);
>                pgactivate++;
>                unlock_page(page);
>                reset_reclaim_mode(sc);                  /// here
>                list_add(&page->lru, &ret_pages);
>        }
> -----------------------------------------------------------------------------------------------------
>
>
> -----------------------------------------------------------------------------------------------------
> bool should_reclaim_stall()
> {
>  (snip)
>
>        /* Only stall on lumpy reclaim */
>        if (sc->reclaim_mode & RECLAIM_MODE_SINGLE)   /// and here
>                return false;
> -----------------------------------------------------------------------------------------------------
>

I did some tracing and the oops happens from the second call to
shrink_page_list after should_reclaim_stall returns true and it hits
the same pages in the same order that the earlier call just finished
calling SetPageActive on.  I have *not* confirmed that the two calls
happened from the same call to shrink_inactive_list, but something's
certainly wrong in there.

This is very easy to reproduce on my laptop.

--Andy
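
For reference, the flag argument quoted above can be modeled in user space. This is only a sketch under stated assumptions, not kernel code: the numeric flag values, the cut-down struct scan_control, and the helpers stall_allowed_by_mode(), model_activate_page() and model_self_check() are hypothetical names introduced here for illustration. It covers just the quoted RECLAIM_MODE_SINGLE check, leaves out every other condition in should_reclaim_stall(), and does not model whatever path produces the behaviour seen in the trace.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag names follow the thread; the numeric values are assumptions. */
enum {
	RECLAIM_MODE_SINGLE       = 0x01,
	RECLAIM_MODE_ASYNC        = 0x02,
	RECLAIM_MODE_LUMPYRECLAIM = 0x08, /* assumed name, not quoted in the thread */
};

/* Cut-down stand-in for the kernel's struct scan_control. */
struct scan_control {
	unsigned int reclaim_mode;
};

/* Same assignment as the quoted reset_reclaim_mode(). */
static void reset_reclaim_mode(struct scan_control *sc)
{
	sc->reclaim_mode = RECLAIM_MODE_SINGLE | RECLAIM_MODE_ASYNC;
}

/*
 * Hypothetical helper: models only the quoted "Only stall on lumpy
 * reclaim" test. The real should_reclaim_stall() has further
 * conditions that are not reproduced here.
 */
static bool stall_allowed_by_mode(const struct scan_control *sc)
{
	if (sc->reclaim_mode & RECLAIM_MODE_SINGLE)
		return false;
	return true;
}

/*
 * Hypothetical helper: models the activate_locked path of
 * shrink_page_list(), reduced to its effect on the reclaim mode.
 */
static void model_activate_page(struct scan_control *sc)
{
	reset_reclaim_mode(sc);
}

/* Walks through the argument: lumpy mode passes the check, a reset mode fails it. */
void model_self_check(void)
{
	struct scan_control sc = {
		.reclaim_mode = RECLAIM_MODE_LUMPYRECLAIM | RECLAIM_MODE_ASYNC,
	};

	assert(stall_allowed_by_mode(&sc));   /* SINGLE bit clear */
	model_activate_page(&sc);             /* a page went through activate_locked */
	assert(!stall_allowed_by_mode(&sc));  /* SINGLE bit now set */
}
```

In this model, once any page goes through the activate path the mode check can no longer pass, which is the expectation stated in the quoted mail. The trace described above does not fit that expectation, so the difference must lie in something the model leaves out.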