Hi everybody,

I'd like to discuss the feasibility of a stat similar to si_mem_available() but at memcg scope, which would specify how much memory can be charged without I/O.

The si_mem_available() stat is based on heuristics, so it does not provide an exact quantity that is actually available at any given time, but it can otherwise provide userspace with some guidance on the amount of reclaimable memory. See the description in Documentation/filesystems/proc.rst and its implementation.

[ Naturally, userspace would need to understand both the amount of memory that is available for allocation and for charging, separately, on an overcommitted system. I assume this is trivial. (Why don't we provide MemAvailable in per-node meminfo?) ]

For such a stat at memcg scope, we can ignore totalreserves and watermarks. We already have ~precise (modulo MEMCG_CHARGE_BATCH) data for both file pages and slab_reclaimable.

We can infer lazily freed memory by doing

	file - (active_file + inactive_file)

(This is necessary because lazy free memory is anon but on the inactive file lru, and we can't infer lazy freeable memory through pglazyfree - pglazyfreed since they are event counters.)

We can also infer the number of underlying compound pages that are on deferred split queues but have yet to be split with

	active_anon - anon

(or is this a bug? :)

So it *seems* like userspace can make a si_mem_available()-like calculation ("avail") by doing

	free = memory.high - memory.current
	lazyfree = file - (active_file + inactive_file)
	deferred = active_anon - anon

	avail = free + lazyfree + deferred +
		(active_file + inactive_file + slab_reclaimable) / 2

For userspace interested in knowing how much memory it can charge without incurring I/O (and assuming it has knowledge of available memory on an overcommitted system), it seems like:

 (a) it can derive the above avail amount that is at least similar to MemAvailable,

 (b) it can assume that all reclaim is considered equal, so anything more than memory.high - memory.current is disruptive enough that it's a better heuristic than the above, or

 (c) the kernel provides an "avail" stat in memory.stat based on the above that can evolve as the kernel implementation changes (how lazy free memory impacts anon vs file lru stats, how deferred split memory is handled, any future extensions for "easily reclaimable memory") and that userspace can count on to the same degree it can count on MemAvailable.

Any thoughts?
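
For reference, here is a rough userspace sketch of what option (a) might look like. It only applies the formula as written in this mail to the cgroup v2 files memory.stat, memory.high and memory.current. The function names (parse_memory_stat, estimate_avail, read_avail) are made up for illustration, and two choices are assumptions of the sketch rather than anything the kernel defines: the "/ 2" is done as integer division, and read_avail() gives up and returns None when memory.high is "max".

```python
from pathlib import Path


def parse_memory_stat(text):
    """Parse "key value" lines (cgroup v2 memory.stat format) into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2:
            stats[fields[0]] = int(fields[1])
    return stats


def estimate_avail(stats, high, current):
    """Apply the "avail" formula from this mail; all quantities are in bytes."""
    lru_file = stats["active_file"] + stats["inactive_file"]
    free = high - current
    lazyfree = stats["file"] - lru_file              # as written in the mail
    deferred = stats["active_anon"] - stats["anon"]  # as written in the mail
    # Integer division is an assumption of this sketch.
    return free + lazyfree + deferred + (lru_file + stats["slab_reclaimable"]) // 2


def read_avail(cgroup_dir):
    """Hypothetical helper: compute the estimate for one cgroup v2 directory.

    Returns None when memory.high is "max" (an assumption of this sketch,
    since the formula needs a finite limit).
    """
    base = Path(cgroup_dir)
    high_raw = (base / "memory.high").read_text().strip()
    if high_raw == "max":
        return None
    current = int((base / "memory.current").read_text())
    stats = parse_memory_stat((base / "memory.stat").read_text())
    return estimate_avail(stats, int(high_raw), current)
```

Something like read_avail("/sys/fs/cgroup/mygroup") would be the intended use, where the path is only an example. Note that the sketch does not clamp the lazyfree or deferred terms, so either may come out negative depending on how the counters relate on a given kernel.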