On Thu, Aug 19, 2010 at 04:15:54PM +0800, Mel Gorman wrote:
> On Thu, Aug 19, 2010 at 04:08:31PM +0800, Wu Fengguang wrote:
> > On Thu, Aug 19, 2010 at 03:46:02PM +0800, Mel Gorman wrote:
> > > On Thu, Aug 19, 2010 at 04:09:38PM +0900, Iram Shahzad wrote:
> > > >> The loop should be waiting for the _other_ processes (doing direct
> > > >> reclaims) to proceed. When there are _lots of_ ongoing page
> > > >> allocations/reclaims, it makes sense to wait for them to calm down a bit?
> > > >
> > > > I have noticed that if I run other process, it helps the loop to exit.
> > > > So is this (ie hanging until other process helps) intended behaviour?
> > > >
> > >
> > > No, it's not but I'm not immediately seeing how it would occur either.
> > > too_many_isolated() should only be true when there are multiple
> > > processes running that are isolating pages be it due to reclaim or
> > > compaction. These should be finishing their work after some time so
> > > while a process may stall in too_many_isolated(), it should not stay
> > > there forever.
> > >
> > > The loop around isolate_migratepages() puts back LRU pages it failed to
> > > migrate so it's not the case that the compacting process is isolating a
> > > large number of pages and then calling too_many_isolated() against itself.
> >
> > It seems the compaction process isolates 128MB pages at a time?
>
> It should be one pageblock at a time for source migration and one pageblock
> for target pages. Look at the values for low_pfn and end_pfn here;

Ah sorry! I confused it with section size..

Thanks,
Fengguang

> static unsigned long isolate_migratepages(struct zone *zone,
> 					struct compact_control *cc)
> {
> 	unsigned long low_pfn, end_pfn;
> 	struct list_head *migratelist = &cc->migratepages;
>
> 	/* Do not scan outside zone boundaries */
> 	low_pfn = max(cc->migrate_pfn, zone->zone_start_pfn);
>
> 	/* Only scan within a pageblock boundary */
> 	end_pfn = ALIGN(low_pfn + pageblock_nr_pages, pageblock_nr_pages);
>
> 	....
>
> and the loop around that looks like
>
> 	while ((ret = compact_finished(zone, cc)) == COMPACT_CONTINUE) {
> 		unsigned long nr_migrate, nr_remaining;
>
> 		if (!isolate_migratepages(zone, cc))
> 			continue;
>
> 		nr_migrate = cc->nr_migratepages;
> 		migrate_pages(&cc->migratepages, compaction_alloc,
> 						(unsigned long)cc, 0);
> 		update_nr_listpages(cc);
> 		nr_remaining = cc->nr_migratepages;
>
> 		count_vm_event(COMPACTBLOCKS);
> 		count_vm_events(COMPACTPAGES, nr_migrate - nr_remaining);
> 		if (nr_remaining)
> 			count_vm_events(COMPACTPAGEFAILED, nr_remaining);
>
> 		/* Release LRU pages not migrated */
> 		if (!list_empty(&cc->migratepages)) {
> 			putback_lru_pages(&cc->migratepages);
> 			cc->nr_migratepages = 0;
> 		}
>
> 	}
>
> Where is it isolating 128MB?
>
> > That
> > sounds risky, too_many_isolated() can easily be true, which will stall
> > direct reclaim processes. I'm not seeing how exactly it makes
> > compaction itself stall infinitely though.
> >
> > > > Also, the other process does help the loop to exit, but again it enters
> > > > the loop and the compaction is never finished. That is, the process
> > > > looks like hanging. Is this intended behaviour?
> > >
> > > Infinite loops are never intended behaviour.
> >
> > Yup.
> >
>
> --
> Mel Gorman
> Part-time Phd Student                          Linux Technology Center
> University of Limerick                         IBM Dublin Software Lab
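
For anyone following the pageblock vs. section size mix-up above, here is a rough userspace sketch of the low_pfn/end_pfn arithmetic quoted from isolate_migratepages(). It is not kernel code. The constants are assumptions, typical of x86-64 with 4KB pages (512-page pageblocks, 128MB sparsemem sections), and are not stated anywhere in the thread; ALIGN_UP, scan_window() and window_pages() are hypothetical names made up for the illustration.

```c
#include <assert.h>

/* Assumed values (x86-64, 4KB pages); the thread does not state them. */
#define PAGE_SHIFT		12UL
#define PAGEBLOCK_NR_PAGES	(1UL << 9)			/* assumption: 512 pages = 2MB */
#define PAGES_PER_SECTION	(1UL << (27 - PAGE_SHIFT))	/* assumption: 128MB section */

/* Round x up to a multiple of a; a must be a power of two. */
#define ALIGN_UP(x, a)	(((x) + (a) - 1) & ~((a) - 1))

/*
 * Hypothetical stand-in for the window computed at the top of
 * isolate_migratepages(): clamp to the zone start, then end the
 * scan at a pageblock boundary.
 */
static void scan_window(unsigned long migrate_pfn, unsigned long zone_start_pfn,
			unsigned long *low_pfn, unsigned long *end_pfn)
{
	*low_pfn = migrate_pfn > zone_start_pfn ? migrate_pfn : zone_start_pfn;
	*end_pfn = ALIGN_UP(*low_pfn + PAGEBLOCK_NR_PAGES, PAGEBLOCK_NR_PAGES);

	/* The window is never empty and always ends on a pageblock boundary. */
	assert(*end_pfn > *low_pfn);
	assert(*end_pfn % PAGEBLOCK_NR_PAGES == 0);
}

/* Number of pages covered by one call's scan window. */
static unsigned long window_pages(unsigned long migrate_pfn,
				  unsigned long zone_start_pfn)
{
	unsigned long low_pfn, end_pfn;

	scan_window(migrate_pfn, zone_start_pfn, &low_pfn, &end_pfn);
	return end_pfn - low_pfn;
}
```

Under these assumed values, window_pages(4096, 0) is 512 pages (2MB), while PAGES_PER_SECTION is 32768 pages (128MB). An unaligned start such as pfn 4100 gives 1020 pages, so in this sketch a single window stays under two pageblocks, nowhere near a section.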