On Thu, Aug 05, 2010 at 03:45:24PM +0900, KOSAKI Motohiro wrote: > > sorry for the _very_ delayed review. > Not to worry. > > <SNIP> > > +/* > > + * When reclaim encounters dirty data, wakeup flusher threads to clean > > + * a maximum of 4M of data. > > + */ > > +#define MAX_WRITEBACK (4194304UL >> PAGE_SHIFT) > > +#define WRITEBACK_FACTOR (MAX_WRITEBACK / SWAP_CLUSTER_MAX) > > +static inline long nr_writeback_pages(unsigned long nr_dirty) > > +{ > > + return laptop_mode ? 0 : > > + min(MAX_WRITEBACK, (nr_dirty * WRITEBACK_FACTOR)); > > +} > > ?? > > As far as I remembered, Hannes pointed out wakeup_flusher_threads(0) is > incorrect. can you fix this? > It's behaving as it should, see http://lkml.org/lkml/2010/7/20/151 > > > > + > > static struct zone_reclaim_stat *get_reclaim_stat(struct zone *zone, > > struct scan_control *sc) > > { > > @@ -649,12 +661,14 @@ static noinline_for_stack void free_page_list(struct list_head *free_pages) > > static unsigned long shrink_page_list(struct list_head *page_list, > > struct scan_control *sc, > > enum pageout_io sync_writeback, > > + int file, > > unsigned long *nr_still_dirty) > > { > > LIST_HEAD(ret_pages); > > LIST_HEAD(free_pages); > > int pgactivate = 0; > > unsigned long nr_dirty = 0; > > + unsigned long nr_dirty_seen = 0; > > unsigned long nr_reclaimed = 0; > > > > cond_resched(); > > @@ -748,6 +762,8 @@ static unsigned long shrink_page_list(struct list_head *page_list, > > } > > > > if (PageDirty(page)) { > > + nr_dirty_seen++; > > + > > /* > > * Only kswapd can writeback filesystem pages to > > * avoid risk of stack overflow > > @@ -875,6 +891,18 @@ keep: > > > > list_splice(&ret_pages, page_list); > > > > + /* > > + * If reclaim is encountering dirty pages, it may be because > > + * dirty pages are reaching the end of the LRU even though the > > + * dirty_ratio may be satisified. In this case, wake flusher > > + * threads to pro-actively clean up to a maximum of > > + * 4 * SWAP_CLUSTER_MAX amount of data (usually 1/2MB) unless > > + * !may_writepage indicates that this is a direct reclaimer in > > + * laptop mode avoiding disk spin-ups > > + */ > > + if (file && nr_dirty_seen && sc->may_writepage) > > + wakeup_flusher_threads(nr_writeback_pages(nr_dirty)); > > Umm.. > I don't think this guessing is so acculate. following is brief of > current isolate_lru_pages(). > > > static unsigned long isolate_lru_pages(unsigned long nr_to_scan, > struct list_head *src, struct list_head *dst, > unsigned long *scanned, int order, int mode, int file) > { > for (scan = 0; scan < nr_to_scan && !list_empty(src); scan++) { > __isolate_lru_page(page, mode, file)) > > if (!order) > continue; > > /* > * Attempt to take all pages in the order aligned region > * surrounding the tag page. Only take those pages of > * the same active state as that tag page. We may safely > * round the target page pfn down to the requested order > * as the mem_map is guarenteed valid out to MAX_ORDER, > * where that page is in a different zone we will detect > * it from its zone id and abort this block scan. > */ > for (; pfn < end_pfn; pfn++) { > struct page *cursor_page; > (snip) > } > > (This was unchanged since initial lumpy reclaim commit) > I think what you are pointing out is that when lumpy-reclaiming from the anon LRU, there may be file pages on the page_list being shrinked. In that case, we might miss an opportunity to wake the flusher threads when it was appropriate. Is that accurate or have you another concern? > That said, merely order-1 isolate_lru_pages(ISOLATE_INACTIVE) makes pfn > neighbor search. then, we might found dirty pages even though the page > don't stay in end of lru. > > What do you think? > For low-order lumpy reclaim, I think it should only be necessary to wake the flusher threads when scanning the file LRU. While there may be file pages lumpy reclaimed while scanning the anon list, I think we would have to show it was a common and real problem before adding the necessary accounting and checks. > > > + > > *nr_still_dirty = nr_dirty; > > count_vm_events(PGACTIVATE, pgactivate); > > return nr_reclaimed; > > @@ -1315,7 +1343,7 @@ shrink_inactive_list(unsigned long nr_to_scan, struct zone *zone, > > spin_unlock_irq(&zone->lru_lock); > > > > nr_reclaimed = shrink_page_list(&page_list, sc, PAGEOUT_IO_ASYNC, > > - &nr_dirty); > > + file, &nr_dirty); > > > > /* > > * If specific pages are needed such as with direct reclaiming > > @@ -1351,7 +1379,8 @@ shrink_inactive_list(unsigned long nr_to_scan, struct zone *zone, > > count_vm_events(PGDEACTIVATE, nr_active); > > > > nr_reclaimed += shrink_page_list(&page_list, sc, > > - PAGEOUT_IO_SYNC, &nr_dirty); > > + PAGEOUT_IO_SYNC, file, > > + &nr_dirty); > > } > > } > > > > -- > > 1.7.1 > > > > > -- Mel Gorman Part-time Phd Student Linux Technology Center University of Limerick IBM Dublin Software Lab -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html