On Thu, Dec 16, 2021 at 11:31:36PM +0800, Qi Zheng wrote: > > > On 12/16/21 9:16 PM, Greg KH wrote: > > On Thu, Dec 16, 2021 at 08:46:54PM +0800, Qi Zheng wrote: > > > In /proc/meminfo, we can show the sum of all the available memory > > > as "MemAvailable". Add the same counter also to per-node meminfo > > > under /sys. > > > > > > With this counter, some processes that bind nodes can make some > > > decisions by reading the "MemAvailable" of the corresponding nodes > > > directly. > > > > > > Signed-off-by: Qi Zheng <zhengqi.arch@xxxxxxxxxxxxx> > > > --- > > > drivers/base/node.c | 4 ++++ > > > include/linux/mm.h | 1 + > > > include/linux/mmzone.h | 5 +++++ > > > mm/page_alloc.c | 49 +++++++++++++++++++++++++++++++++++++++++++++++++ > > > 4 files changed, 59 insertions(+) > > > > > > diff --git a/drivers/base/node.c b/drivers/base/node.c > > > index 87acc47e8951..deb2a7965ae4 100644 > > > --- a/drivers/base/node.c > > > +++ b/drivers/base/node.c > > > @@ -375,8 +375,10 @@ static ssize_t node_read_meminfo(struct device *dev, > > > struct sysinfo i; > > > unsigned long sreclaimable, sunreclaimable; > > > unsigned long swapcached = 0; > > > + long available; > > > si_meminfo_node(&i, nid); > > > + available = si_mem_available_node(&i, nid); > > > sreclaimable = node_page_state_pages(pgdat, NR_SLAB_RECLAIMABLE_B); > > > sunreclaimable = node_page_state_pages(pgdat, NR_SLAB_UNRECLAIMABLE_B); > > > #ifdef CONFIG_SWAP > > > @@ -386,6 +388,7 @@ static ssize_t node_read_meminfo(struct device *dev, > > > "Node %d MemTotal: %8lu kB\n" > > > "Node %d MemFree: %8lu kB\n" > > > "Node %d MemUsed: %8lu kB\n" > > > + "Node %d MemAvailable: %8lu kB\n" > > > > You just changed a user/kernel api without documenting it anywhere, or > > ensuring that you did not just break anything. > > Hi greg k-h, > > The MemAvailable has long existed in the /proc/meminfo, it's meaning > has been described in the Documentation/filesystems/proc.rst. Since > the semantics of per-node MemAvailable has not changed, so I did not > add a new document description. This is not a proc file, it is in sysfs. And it violates all of the sysfs rules, and has all of the problems that proc files have. So the worst of both worlds :( > > Also, this api is crazy, and not ok, please never add anything new to > > it, it is broken as-is. > > The consideration of adding per-node MemAvailable is that some processes > that bind nodes need this information to do some decisions. > > Now their approach is to read other information in per-node meminfo > and /proc/sys/vm/watermark_scale_factor, and then approximate this > value. With this counter, they can directly read > /sys/devices/system/node/node*/meminfo to get the MemAvailable > information of each node. > > And MemTotal, MemFree and SReclaimable(etc.) all have corresponding > per-node versions, so I think that adding per-node MemAvailable might > also make sense. :) Please no, I do not want new things added to this file, as you might break parsers of this file. Also, again, you did not document this at all in Documentation/ABI/ so for that reason alone it is not ok. thanks, greg k-h