On 8/7/23 23:15, David Hildenbrand wrote:
On 06.08.23 09:48, Xueshi Hu wrote:
There are currently three 'nr_hugepages' used to export the number of huge
pages:
1. /proc/sys/vm/nr_hugepages
2. /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
3. /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
For consistency, all three 'nr_hugepages' should return the total number
of huge pages. When written, the number of persistent huge pages will be
adjusted to the specified value.
But, /proc/sys/vm/nr_hugepages returns the number of persistent huge
pages.
But that's documented behavior, no?
Documentation/admin-guide/mm/hugetlbpage.rst
``/proc/sys/vm/nr_hugepages`` indicates the current number of "persistent"
huge pages in the kernel's huge page pool. "Persistent" huge pages will be
returned to the huge page pool when freed by a task. A user with root
privileges can dynamically allocate more or free some persistent huge pages
by increasing or decreasing the value of ``nr_hugepages``.
Actually, Documentation/admin-guide/mm/hugetlbpage.rst is contradictory
about the definition of /proc/sys/vm/nr_hugepages.
The documentation says:
- ``/proc/sys/vm/nr_hugepages`` indicates the current number of
"persistent" huge pages.
But the documentation also says:
- The ``/proc`` interfaces discussed above have been retained for
backwards compatibility.
- The ``nr_hugepages`` attribute returns the total number of huge pages on
the specified node. When this attribute is written, the number of
persistent huge pages on the parent node will be adjusted to the specified
value, if sufficient resources exist, regardless of the task's mempolicy
or cpuset constraints.
So I created patch 4 to make the documentation clearer.
When such subtle inconsistencies lead to unexpected behavior, they are
hard for a system administrator to detect.
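
As a rough illustration of what such a check could look like, here is a
minimal Python sketch that reads the three files and flags a mismatch
between /proc and sysfs. The function names (read_count, hugepage_counts,
differs) and the root parameter are hypothetical, introduced only for this
example. It assumes 2 MB huge pages and that surplus_hugepages sits next to
the global sysfs nr_hugepages file; it is not part of the patch series.

```python
from pathlib import Path


def read_count(path):
    """Return the integer stored in a one-line kernel file, or None if unreadable."""
    try:
        return int(Path(path).read_text().split()[0])
    except (OSError, ValueError, IndexError):
        return None


def hugepage_counts(root="/", size="hugepages-2048kB"):
    """Collect nr_hugepages from /proc, the global sysfs entry, and each node.

    `root` is a hypothetical knob so the function can be pointed at a fake
    directory tree; on a real system leave it as "/".
    """
    root = Path(root)
    pool = root / "sys/kernel/mm/hugepages" / size
    nodes = sorted((root / "sys/devices/system/node").glob("node[0-9]*"))
    per_node = {
        n.name: read_count(n / "hugepages" / size / "nr_hugepages") for n in nodes
    }
    return {
        "proc": read_count(root / "proc/sys/vm/nr_hugepages"),
        "sysfs": read_count(pool / "nr_hugepages"),
        "surplus": read_count(pool / "surplus_hugepages"),
        "per_node": per_node,
    }


def differs(counts):
    """True when /proc and the global sysfs file both exist and disagree."""
    return (
        counts["proc"] is not None
        and counts["sysfs"] is not None
        and counts["proc"] != counts["sysfs"]
    )


if __name__ == "__main__":
    counts = hugepage_counts()
    print(counts)
    if differs(counts):
        print("/proc and sysfs nr_hugepages disagree")
```

If the two values differ, comparing the gap with the "surplus" entry is one
way to check whether surplus pages account for it.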
Thanks,
Hu