On Thu, Aug 19, 2010 at 04:08:31PM +0800, Wu Fengguang wrote:
> On Thu, Aug 19, 2010 at 03:46:02PM +0800, Mel Gorman wrote:
> > On Thu, Aug 19, 2010 at 04:09:38PM +0900, Iram Shahzad wrote:
> > >> The loop should be waiting for the _other_ processes (doing direct
> > >> reclaims) to proceed.  When there are _lots of_ ongoing page
> > >> allocations/reclaims, it makes sense to wait for them to calm down a bit?
> > >
> > > I have noticed that if I run other process, it helps the loop to exit.
> > > So is this (ie hanging until other process helps) intended behaviour?
> > >
> >
> > No, it's not but I'm not immediately seeing how it would occur either.
> > too_many_isolated() should only be true when there are multiple
> > processes running that are isolating pages be it due to reclaim or
> > compaction. These should be finishing their work after some time so
> > while a process may stall in too_many_isolated(), it should not stay
> > there forever.
> >
> > The loop around isolate_migratepages() puts back LRU pages it failed to
> > migrate so it's not the case that the compacting process is isolating a
> > large number of pages and then calling too_many_isolated() against itself.
>
> It seems the compaction process isolates 128MB pages at a time?

It should be one pageblock at a time for source migration and one
pageblock for target pages. Look at the values for low_pfn and end_pfn
here;

static unsigned long isolate_migratepages(struct zone *zone,
                                        struct compact_control *cc)
{
        unsigned long low_pfn, end_pfn;
        struct list_head *migratelist = &cc->migratepages;

        /* Do not scan outside zone boundaries */
        low_pfn = max(cc->migrate_pfn, zone->zone_start_pfn);

        /* Only scan within a pageblock boundary */
        end_pfn = ALIGN(low_pfn + pageblock_nr_pages, pageblock_nr_pages);

....
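To make the end_pfn arithmetic above concrete, here is a rough userspace
sketch, not the kernel code itself. It assumes 512 pages per pageblock
(2MB with 4KB pages); the real value comes from pageblock_order and depends
on architecture and config. ALIGN_UP, PAGEBLOCK_NR_PAGES, scan_end_pfn() and
scan_window_pages() are names made up for illustration; ALIGN_UP is meant to
behave like the kernel's ALIGN() for power-of-two alignments.

```c
#include <assert.h>

/* Userspace stand-in for the kernel's ALIGN(); only valid when 'a' is a power of two. */
#define ALIGN_UP(x, a) (((x) + ((a) - 1)) & ~((a) - 1))

/* ASSUMPTION: 512 pages per pageblock. Not taken from the thread. */
#define PAGEBLOCK_NR_PAGES 512UL

/* Hypothetical helper mirroring the end_pfn calculation quoted above. */
static unsigned long scan_end_pfn(unsigned long low_pfn)
{
	/* The mask trick in ALIGN_UP requires a power-of-two alignment. */
	assert((PAGEBLOCK_NR_PAGES & (PAGEBLOCK_NR_PAGES - 1)) == 0);
	return ALIGN_UP(low_pfn + PAGEBLOCK_NR_PAGES, PAGEBLOCK_NR_PAGES);
}

/* Number of pfns a single scan starting at low_pfn can cover. */
static unsigned long scan_window_pages(unsigned long low_pfn)
{
	return scan_end_pfn(low_pfn) - low_pfn;
}
```

With these assumed numbers, an aligned low_pfn gives a window of exactly one
pageblock, and an unaligned one rounds up to the next boundary, so the window
always stays under two pageblocks.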
and the loop around that looks like

        while ((ret = compact_finished(zone, cc)) == COMPACT_CONTINUE) {
                unsigned long nr_migrate, nr_remaining;

                if (!isolate_migratepages(zone, cc))
                        continue;

                nr_migrate = cc->nr_migratepages;
                migrate_pages(&cc->migratepages, compaction_alloc,
                                                (unsigned long)cc, 0);
                update_nr_listpages(cc);
                nr_remaining = cc->nr_migratepages;

                count_vm_event(COMPACTBLOCKS);
                count_vm_events(COMPACTPAGES, nr_migrate - nr_remaining);
                if (nr_remaining)
                        count_vm_events(COMPACTPAGEFAILED, nr_remaining);

                /* Release LRU pages not migrated */
                if (!list_empty(&cc->migratepages)) {
                        putback_lru_pages(&cc->migratepages);
                        cc->nr_migratepages = 0;
                }

        }

Where is it isolating 128MB?

> That
> sounds risky, too_many_isolated() can easily be true, which will stall
> direct reclaim processes.  I'm not seeing how exactly it makes
> compaction itself stall infinitely though.
>
> > > Also, the other process does help the loop to exit, but again it enters
> > > the loop and the compaction is never finished. That is, the process
> > > looks like hanging. Is this intended behaviour?
> >
> > Infinite loops are never intended behaviour.
>
> Yup.
>

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab