Even before we added MemAvailable, users knew that page cache is easily convertible to free memory on pressure, and estimated their "available" memory by looking at the sum of MemFree, Cached, Buffers. However, "Cached" is calculated using NR_FILE_PAGES, which includes shmem and random driver pages inserted into the page tables; neither of which are easily reclaimable, or reclaimable at all. Reclaiming shmem requires swapping, which is slow. And unlike page cache, which has fairly conservative dirty limits, all of shmem needs to be written out before becoming evictable. Without swap, shmem is not evictable at all. And driver pages certainly never are. Calling these pages "Cached" is misleading and has resulted in broken formulas in userspace. They misrepresent the memory situation and cause either waste or unexpected OOM kills. With 64-bit and per-cpu memory we are way past the point where the relationship between virtual and physical memory is meaningful and users can rely on overcommit protection. OOM kills can not be avoided without wasting enormous amounts of memory this way. This shifts the management burden toward userspace, toward applications monitoring their environment and adjusting their operations. And so where statistics like /proc/meminfo used to be more informational, we have more and more software relying on them to make automated decisions based on utilization. But if userspace is supposed to take over responsibility, it needs a clear and accurate kernel interface to base its judgement on. And one of the requirements is certainly that memory consumers with wildly different reclaimability are not conflated. Adding MemAvailable is a good step in that direction, but there is software like Sigar[1] in circulation that might not get updated anytime soon. And even then, new users will continue to go for the intuitive interpretation of the Cached item. We can't blame them. There are years of tradition behind it, starting with the way free(1) and vmstat(8) have always reported free, buffers, cached. And try as we might, using "Cached" for unevictable memory is never going to be obvious. The semantics of Cached including shmem and kernel pages have been this way forever, dictated by the single-LRU implementation rather than optimal semantics. So it's an uncomfortable proposal to change it now. But what other way to fix this for existing users? What other way to make the interface more intuitive for future users? And what could break by removing it now? I guess somebody who already subtracts Shmem from Cached. What are your thoughts on this? [1] https://github.com/hyperic/sigar/blob/master/src/os/linux/linux_sigar.c#L323 --- fs/proc/meminfo.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/fs/proc/meminfo.c b/fs/proc/meminfo.c index df4661abadc4..e19126be1dca 100644 --- a/fs/proc/meminfo.c +++ b/fs/proc/meminfo.c @@ -43,14 +43,14 @@ static int meminfo_proc_show(struct seq_file *m, void *v) si_swapinfo(&i); committed = percpu_counter_read_positive(&vm_committed_as); - cached = global_page_state(NR_FILE_PAGES) - - total_swapcache_pages() - i.bufferram; - if (cached < 0) - cached = 0; - for (lru = LRU_BASE; lru < NR_LRU_LISTS; lru++) pages[lru] = global_page_state(NR_LRU_BASE + lru); + cached = pages[LRU_ACTIVE_FILE] + pages[LRU_INACTIVE_FILE]; + cached -= i.bufferram; + if (cached < 0) + cached = 0; + for_each_zone(zone) wmark_low += zone->watermark[WMARK_LOW]; -- 2.7.1 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>