On 28/06/2024 14:07, Lance Yang wrote:
> This commit introduces documentation for mTHP split counters in
> transhuge.rst.
>
> Signed-off-by: Mingzhe Yang <mingzhe.yang@xxxxxx>
> Signed-off-by: Lance Yang <ioworker0@xxxxxxxxx>
> ---
>  Documentation/admin-guide/mm/transhuge.rst | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
>
> diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst
> index 1f72b00af5d3..709fe10b60f4 100644
> --- a/Documentation/admin-guide/mm/transhuge.rst
> +++ b/Documentation/admin-guide/mm/transhuge.rst
> @@ -514,6 +514,22 @@ file_fallback_charge
>  	falls back to using small pages even though the allocation was
>  	successful.

I note at the top of this section there is a note:

  Monitoring usage
  ================

  .. note::
     Currently the below counters only record events relating to
     PMD-sized THP. Events relating to other THP sizes are not included.

Which is out of date, now that we support mTHP stats. Perhaps it should be
removed?

>
> +split
> +	is incremented every time a huge page is successfully split into
> +	base pages. This can happen for a variety of reasons but a common
> +	reason is that a huge page is old and is being reclaimed.
> +	This action implies splitting any block mappings into PTEs.

Now that I'm reading this, I'm reminded that Yang Shi suggested at LSFMM
that a potential aid to solving the swap-out fragmentation problem is to
split high orders to lower (but not 0) orders. I don't know if we would
take that route, but in principle it sounds like splitting mTHP to smaller
mTHP might be something we want some day. I wonder if we should spec this
counter to also include splits to smaller orders and not just splits to
base pages?

Actually, looking at the code, I think
split_huge_page_to_list_to_order(order>0) would already increment this
counter without actually splitting to base pages. So the documentation
should probably just reflect that.

> +
> +split_failed
> +	is incremented if kernel fails to split huge
> +	page. This can happen if the page was pinned by somebody.
> +
> +split_deferred
> +	is incremented when a huge page is put onto split
> +	queue. This happens when a huge page is partially unmapped and
> +	splitting it would free up some memory. Pages on split queue are
> +	going to be split under memory pressure.
> +
>  As the system ages, allocating huge pages may be expensive as the
>  system uses memory compaction to copy data around memory to free a
>  huge page for use. There are some counters in ``/proc/vmstat`` to help
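
As an aside, for anyone who wants to watch these three counters while
testing, here is a minimal Python sketch. It assumes the per-size mTHP
stats are exposed as one file per counter under
/sys/kernel/mm/transparent_hugepage/hugepages-<size>kB/stats/; the helper
name read_split_counters() is made up for illustration and is not part of
the patch.

```python
from pathlib import Path

# Counter names taken from the patch under review.
SPLIT_COUNTERS = ("split", "split_failed", "split_deferred")

# Assumption: per-size mTHP stats directories live below this root.
DEFAULT_ROOT = Path("/sys/kernel/mm/transparent_hugepage")


def read_split_counters(root=DEFAULT_ROOT):
    """Return {size_dir_name: {counter: value}} for each hugepages-*kB dir.

    Hypothetical helper for illustration only. Counters that are missing
    or unreadable are skipped rather than reported as zero.
    """
    root = Path(root)
    results = {}
    if not root.is_dir():
        return results
    for size_dir in sorted(root.glob("hugepages-*kB")):
        stats_dir = size_dir / "stats"
        counters = {}
        for name in SPLIT_COUNTERS:
            try:
                counters[name] = int((stats_dir / name).read_text().strip())
            except (OSError, ValueError):
                # File absent (e.g. kernel without this patch) or not a number.
                continue
        if counters:
            results[size_dir.name] = counters
    return results


if __name__ == "__main__":
    for size, counters in read_split_counters().items():
        print(size, counters)
```

Sampling this before and after a workload and comparing the values per
size would show whether splits are succeeding, failing, or being deferred.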