On Tue, 9 Feb 2016 17:42:56 -0500
Johannes Weiner <hannes@xxxxxxxxxxx> wrote:

> On Tue, Feb 09, 2016 at 05:52:40PM +0100, Andres Freund wrote:
> > Rik asked me about active/inactive sizing in /proc/meminfo:
> > Active:          7860556 kB
> > Inactive:        5395644 kB
> > Active(anon):    2874936 kB
> > Inactive(anon):   432308 kB
> > Active(file):    4985620 kB
> > Inactive(file):  4963336 kB
>
> Yes, a generous minimum size of the inactive list made sense when it
> was the exclusive staging area to tell use-once pages from use-many
> pages. Now that we have refault information to detect use-many with
> arbitrary inactive list size, this minimum is no longer reasonable.
>
> The new minimum should be smaller, but big enough for applications to
> actually use the data in their pages between fault and eviction
> (i.e. it needs to take the aggregate readahead window into account),
> and big enough for active pages that are speculatively challenged
> during workingset changes to get re-activated without incurring IO.
>
> However, I don't think it makes sense to dynamically adjust the
> balance between the active and the inactive cache during refaults.

Johannes, does this patch look ok to you?

Andres, does this patch work for you?

-----8<-----

Subject: mm,vmscan: reduce size of inactive file list

The inactive file list should still be large enough to contain
readahead windows and freshly written file data, but it no
longer is the only source for detecting multiple accesses to
file pages. The workingset refault measurement code causes
recently evicted file pages that get accessed again after a
shorter interval to be promoted directly to the active list.

With that mechanism in place, we can afford to (on a larger
system) dedicate more memory to the active file list, so we
can actually cache more of the frequently used file pages in
memory, and not have them pushed out by streaming writes,
once-used streaming file reads, etc.

This can help things like database workloads, where only half
the page cache can currently be used to cache the database
working set. This patch automatically increases that fraction
on larger systems, using the same ratio that has already been
used for anonymous memory.

Signed-off-by: Rik van Riel <riel@xxxxxxxxxx>
Reported-by: Andres Freund <andres@xxxxxxxxxxx>
---
 mm/vmscan.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index eb3dd37ccd7c..0a316c41bf80 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1928,13 +1928,14 @@ static inline bool inactive_anon_is_low(struct lruvec *lruvec)
  */
 static bool inactive_file_is_low(struct lruvec *lruvec)
 {
+	struct zone *zone = lruvec_zone(lruvec);
 	unsigned long inactive;
 	unsigned long active;
 
 	inactive = get_lru_size(lruvec, LRU_INACTIVE_FILE);
 	active = get_lru_size(lruvec, LRU_ACTIVE_FILE);
 
-	return active > inactive;
+	return inactive * zone->inactive_ratio < active;
 }
 
 static bool inactive_list_is_low(struct lruvec *lruvec, enum lru_list lru)
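
For anyone who wants to play with the numbers, here is a minimal userspace
sketch (not kernel code) that compares the old and the new check. The
sketch_* names and the 4 KiB page size are hypothetical. The ratio helper
assumes that zone->inactive_ratio is derived the way it is for anonymous
memory, roughly int_sqrt(10 * zone size in GB) with a minimum of 1; please
check that against the tree you are running.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace illustration only. Assumes 4 KiB pages (hypothetical). */
#define SKETCH_PAGE_SHIFT 12

/* floor(sqrt(x)), simple linear version; fine for the small inputs here. */
static unsigned long sketch_int_sqrt(unsigned long x)
{
	unsigned long r = 0;

	while ((r + 1) * (r + 1) <= x)
		r++;
	return r;
}

/*
 * Assumption: mirrors how the anon inactive ratio is computed per zone,
 * i.e. int_sqrt(10 * size_in_GB), or 1 for zones smaller than 1 GB.
 */
static unsigned long sketch_inactive_ratio(unsigned long managed_pages)
{
	unsigned long gb = managed_pages >> (30 - SKETCH_PAGE_SHIFT);

	return gb ? sketch_int_sqrt(10 * gb) : 1;
}

/* Check before the patch: inactive is "low" once active outgrows it. */
static bool sketch_old_inactive_file_is_low(unsigned long inactive,
					    unsigned long active)
{
	return active > inactive;
}

/* Check after the patch: active may be up to ratio times inactive. */
static bool sketch_new_inactive_file_is_low(unsigned long inactive,
					    unsigned long active,
					    unsigned long ratio)
{
	assert(ratio >= 1);
	return inactive * ratio < active;
}
```

Plugging in the file numbers from the /proc/meminfo output quoted earlier
(about 1246405 active and 1240834 inactive pages at 4 KiB) with an assumed
ratio of 12 for a 16 GB zone, the old check reports the inactive file list
as low while the new one does not. Keep in mind that meminfo is system-wide
and the real check is per lruvec, so this is only a rough illustration.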