On Mon, Jan 29, 2024 at 02:52:23PM -0800, Greg KH wrote:
> On Mon, Jan 29, 2024 at 02:42:04PM -0800, Sourav Panda wrote:
> > Adds two new per-node fields, namely nr_page_metadata and
> > nr_page_metadata_boot, to /sys/devices/system/node/nodeN/vmstat
> > and a global PageMetadata field to /proc/meminfo. This information can
> > be used by users to see how much memory is being used by per-page
> > metadata, which can vary depending on build configuration, machine
> > architecture, and system use.
> >
> > Per-page metadata is the amount of memory that Linux needs in order to
> > manage memory at the page granularity. The majority of such memory is
> > used by "struct page" and "page_ext" data structures. In contrast to
> > most other memory consumption statistics, per-page metadata might not
> > be included in MemTotal. For example, MemTotal does not include memblock
> > allocations but includes buddy allocations. In this patch, exported
> > field nr_page_metadata in /sys/devices/system/node/nodeN/vmstat would
> > exclusively track buddy allocations while nr_page_metadata_boot would
> > exclusively track memblock allocations. Furthermore, PageMetadata in
> > /proc/meminfo would exclusively track buddy allocations allowing it to
> > be compared against MemTotal.
> >
> > This memory depends on build configurations, machine architectures, and
> > the way system is used:
> >
> > Build configuration may include extra fields into "struct page",
> > and enable / disable "page_ext"
> > Machine architecture defines base page sizes. For example 4K x86,
> > 8K SPARC, 64K ARM64 (optionally), etc. The per-page metadata
> > overhead is smaller on machines with larger page sizes.
> > System use can change per-page overhead by using vmemmap
> > optimizations with hugetlb pages, and emulated pmem devdax pages.
> > Also, boot parameters can determine whether page_ext is needed
> > to be allocated. This memory can be part of MemTotal or be outside
> > MemTotal depending on whether the memory was hot-plugged, booted with,
> > or hugetlb memory was returned back to the system.
> >
> > Suggested-by: Pasha Tatashin <pasha.tatashin@xxxxxxxxxx>
> > Signed-off-by: Sourav Panda <souravpanda@xxxxxxxxxx>
> > ---
> >  Documentation/filesystems/proc.rst |  3 +++
> >  fs/proc/meminfo.c                  |  4 ++++
> >  include/linux/mmzone.h             |  4 ++++
> >  include/linux/vmstat.h             |  4 ++++
> >  mm/hugetlb_vmemmap.c               | 19 ++++++++++++++----
> >  mm/mm_init.c                       |  3 +++
> >  mm/page_alloc.c                    |  1 +
> >  mm/page_ext.c                      | 32 +++++++++++++++++++++---------
> >  mm/sparse-vmemmap.c                |  8 ++++++++
> >  mm/sparse.c                        |  7 ++++++-
> >  mm/vmstat.c                        | 26 +++++++++++++++++++++++-
> >  11 files changed, 96 insertions(+), 15 deletions(-)
> >
> > diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst
> > index 49ef12df631b..d5901d04e082 100644
> > --- a/Documentation/filesystems/proc.rst
> > +++ b/Documentation/filesystems/proc.rst
> > @@ -993,6 +993,7 @@ Example output. You may not have all of these fields.
> >      AnonPages:       4654780 kB
> >      Mapped:           266244 kB
> >      Shmem:              9976 kB
> > +    PageMetadata:     513419 kB
> >      KReclaimable:     517708 kB
> >      Slab:             660044 kB
> >      SReclaimable:     517708 kB
>
> Why are you adding it to the middle of the file?  Are you sure the
> userspace tools that parse this file today can handle an unknown field
> here, and not just at the end of the file?

FWIW, looking at git blame for fs/proc/meminfo.c, it seems like people
have generally been adding items where it makes sense semantically, not
at the end of the file.  So maybe that's okay for userspace tools.
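
To illustrate why a mid-file addition would be harmless for at least one
class of tools, here is a minimal sketch, assuming a parser that looks
fields up by name rather than by line position. The names parse_meminfo,
OLD and NEW are hypothetical and not taken from any real tool; the sample
values come from the proc.rst hunk quoted above. Tools that index by line
number would behave differently, and this says nothing about how any
particular existing tool is written.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field name: integer value}.

    Most values are in kB; a few fields (e.g. HugePages_Total) are plain
    counts with no unit. Lines that do not look like "Name: <number>"
    are skipped, so unknown or newly added fields never break the parse.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if not parts or not parts[0].isdigit():
            continue  # no numeric value to record
        fields[name.strip()] = int(parts[0])
    return fields


# Hypothetical sample: the same excerpt without and with the new field.
OLD = """\
AnonPages:       4654780 kB
Mapped:           266244 kB
Shmem:              9976 kB
KReclaimable:     517708 kB
"""

NEW = """\
AnonPages:       4654780 kB
Mapped:           266244 kB
Shmem:              9976 kB
PageMetadata:     513419 kB
KReclaimable:     517708 kB
"""

old_fields = parse_meminfo(OLD)
new_fields = parse_meminfo(NEW)

# Every field known before the change keeps its value after it.
unchanged = all(new_fields[k] == v for k, v in old_fields.items())
```

With a lookup like new_fields["KReclaimable"], the position of
PageMetadata in the file does not matter; only a parser that assumes
fixed line offsets would notice the insertion.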