On Thu, Nov 24, 2011 at 10:07:55AM +0900, KAMEZAWA Hiroyuki wrote:
>
> Can I make a question ?
>
> On Wed, 23 Nov 2011 14:34:16 +0100
> Johannes Weiner <hannes@xxxxxxxxxxx> wrote:
>
> > +		/*
> > +		 * When allocating a page cache page for writing, we
> > +		 * want to get it from a zone that is within its dirty
> > +		 * limit, such that no single zone holds more than its
> > +		 * proportional share of globally allowed dirty pages.
> > +		 * The dirty limits take into account the zone's
> > +		 * lowmem reserves and high watermark so that kswapd
> > +		 * should be able to balance it without having to
> > +		 * write pages from its LRU list.
> > +		 *
> > +		 * This may look like it could increase pressure on
> > +		 * lower zones by failing allocations in higher zones
> > +		 * before they are full.  But the pages that do spill
> > +		 * over are limited as the lower zones are protected
> > +		 * by this very same mechanism.  It should not become
> > +		 * a practical burden to them.
> > +		 *
> > +		 * XXX: For now, allow allocations to potentially
> > +		 * exceed the per-zone dirty limit in the slowpath
> > +		 * (ALLOC_WMARK_LOW unset) before going into reclaim,
> > +		 * which is important when on a NUMA setup the allowed
> > +		 * zones are together not big enough to reach the
> > +		 * global limit.  The proper fix for these situations
> > +		 * will require awareness of zones in the
> > +		 * dirty-throttling and the flusher threads.
> > +		 */
> > +		if ((alloc_flags & ALLOC_WMARK_LOW) &&
> > +		    (gfp_mask & __GFP_WRITE) && !zone_dirty_ok(zone))
> > +			goto this_zone_full;
> >
> >  		BUILD_BUG_ON(ALLOC_NO_WATERMARKS < NR_WMARK);
> >  		if (!(alloc_flags & ALLOC_NO_WATERMARKS)) {
>
> This will call
>
> 	if (NUMA_BUILD)
> 		zlc_mark_zone_full(zonelist, z);
>
> And this zone will be marked as full.
>
> IIUC, zlc_clear_zones_full() is called only when direct reclaim ends.
> So, if no one calls direct-reclaim, 'full' mark may never be cleared
> even when number of dirty pages goes down to safe level ?
> I'm sorry if this is already discussed.

The zonelist cache does not remember which zones are marked full for
longer than a second - see zlc_setup() - and it also ignores this
information when an iteration over the zonelist with the cache
enabled came up empty-handed.

I thought it would make sense to take advantage of the cache and save
the zone_dirty_ok() checks against ineligible zones too on subsequent
iterations.
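
To illustrate the two escape hatches described in the reply (marks
expire after about a second, and a scan that finds nothing is repeated
with the cache ignored), here is a rough userspace model.  It is only a
sketch and not the kernel code: struct zone_cache, cache_setup(),
pick_zone(), MAX_ZONES and ZLC_EXPIRY_TICKS are all hypothetical names,
time is a plain tick counter standing in for jiffies, and the zone_ok[]
array stands in for the watermark and zone_dirty_ok() checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_ZONES        8      /* hypothetical bound for this model */
#define ZLC_EXPIRY_TICKS 1000   /* models "one second" at 1000 ticks/s */

/* Hypothetical stand-in for the zonelist cache. */
struct zone_cache {
	bool full[MAX_ZONES];       /* zones recently found ineligible */
	unsigned long last_zap;     /* tick at which marks were last cleared */
};

/* Forget all 'full' marks once they are older than the expiry window. */
static void cache_setup(struct zone_cache *zc, unsigned long now)
{
	if (now - zc->last_zap > ZLC_EXPIRY_TICKS) {
		memset(zc->full, 0, sizeof(zc->full));
		zc->last_zap = now;
	}
}

/*
 * Return the index of the first usable zone, or -1 if there is none.
 * zone_ok[z] models "watermarks fine and zone within its dirty limit".
 */
static int pick_zone(struct zone_cache *zc, unsigned long now,
		     const bool *zone_ok, int nr_zones)
{
	bool use_cache = true;
	int z;

	assert(nr_zones >= 0 && nr_zones <= MAX_ZONES);
	cache_setup(zc, now);
retry:
	for (z = 0; z < nr_zones; z++) {
		/* Skip zones that were marked full within the last window. */
		if (use_cache && zc->full[z])
			continue;
		if (!zone_ok[z]) {
			zc->full[z] = true;   /* remember for later scans */
			continue;
		}
		return z;
	}
	/* Came up empty-handed with the cache on: rescan without it. */
	if (use_cache) {
		use_cache = false;
		goto retry;
	}
	return -1;
}
```

In this model a zone that drops back under its dirty limit is skipped
for at most ZLC_EXPIRY_TICKS, and not even that long if every other
zone is unusable, because the second pass looks at it again.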